Working with Persons in Wikibase

Adding persons as named entities to Wikidata or a Wikibase

Persons

Agents

  • In this tutorial we show how to create entries for Agents, at first only for natural persons. Agents have the capacity to engage in activities (processes or events) that create things. Our examples are taken from music, so we add composers, performers and producers to a knowledge base.
  • We have a similar tutorial for getting started with Things in the online presentation Music Entities↗, for example, adding a musical work to a composer, or live and recorded performances (sound recordings) to musicians and their groups.
  • Other Agents, such as photographers creating photographs in their ateliers, companies releasing tons of greenhouse gases, pathogens causing known diseases, or natural dyes colouring archaic textiles, can be added similarly.
  • We will create a similar presentation on activities (synonymous with events or processes). These are more advanced topics because recording the time and place components can be rather complicated (e.g., locating an archaeological finding below the surface, or an atelier within a building whose street address changed many times over the course of history).

Data Model Basics

We record information on how persons and other Agents (corporate bodies, legal persons) participate in Activities (processes, events) to create or change Things (such as literary or musical works, dresses, or houses).

Data & Metadata Provenance

A generalised data model describes the relationships between ⏩ Activities (such as creation, recording, …) and the ⏩ Agents involved in these activities (performers, record labels, …), which result in abstract ⏩ Things (Music Entities) like musical works, recordings, concerts…
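As a rough illustration of this model, the sketch below represents Agents, Things and Activities as plain Python data classes. The class names, field names and the helper `things_created_by` are our own assumptions for illustration; they are not part of the Wikibase Data Model.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(frozen=True)
class Agent:
    name: str
    role: str  # e.g. "composer", "performer", "producer"


@dataclass(frozen=True)
class Thing:
    title: str
    kind: str  # e.g. "musical work", "audio recording"


@dataclass
class Activity:
    kind: str  # e.g. "creation process", "sound recording process"
    agents: List[Agent] = field(default_factory=list)
    results: List[Thing] = field(default_factory=list)
    place: Optional[str] = None  # left empty when not known
    time: Optional[str] = None   # left empty when not known


def things_created_by(agent, activities):
    """Collect every Thing that results from an Activity the agent took part in."""
    found = []
    for activity in activities:
        if agent in activity.agents:
            found.extend(activity.results)
    return found


swift = Agent("Taylor Swift", "performer")
recording = Activity(
    "sound recording process",
    agents=[swift],
    results=[Thing("Gold Rush", "audio recording")],
)
print(things_created_by(swift, [recording]))
```

The Activity sits between Agents and Things as a node, which is why time and place are attached to it rather than to the person or the work.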

Music Entities (Things)

Music entities | Description
musical work | a novel way of playing music, and potentially accompanying it with lyrics; when registered, a sound recording or a sheet of the music is provided for future identification, which is matched with the author’s data and connected to an ISWC identifier.
music sheet | a form of recording the work for future performance; when a music publisher makes it available, it receives an ISMN identifier (older sheets have ISBN book identifiers.)
audio recording | fixation of sounds [ISRC standard]
music video recording | fixation of sounds synchronized with pictures or moving pictures where (a) the fixed sounds are wholly or substantially a musical performance or (b) the recording (3.3) is intended for viewing in association with a recording of a musical performance. [ISRC standard]

Music Agents (Persons)

Agent | Description
author | The author of music or lyrics. Entitled to authors’ right or copyright.
composer | The author of a musical work. Entitled to authors’ right or copyright.
lyricist | The author of the text of a musical work, or a literary text that is arranged together with a musical work. Entitled to authors’ right or copyright.
performer | The performer of a musical work; in case of a sound recording, the performer whose performance is fixed in the recording. They may be entitled to neighbouring or sound recording copyrights.
producer | The person or legal entity that produces the recorded fixation of the sound recording. They are entitled to neighbouring or sound recording copyrights.
👉🏿 Agents create Things (like a musical work), and they participate in ⏩ Activities taking place at a location and at a given time (like a sound recording process.)

Music Agents (Corporate Bodies)

Agent | Description
rights management (organisations) | agents managing the rights on behalf of rights owners. These can be companies whose sole purpose is to ensure that the royalties due for licensed content are identified and accounted for. The role can be taken by collective management organisations or by private companies on behalf of songwriters, composers, performers, music publishers, or record labels.
musical group or band | a group of agents (music performers) acting together to make live performances (concerts) or sound recordings; usually they have no legal personality, and membership may change frequently.
👉🏻 Agents are often legal persons (for example, music publishing companies) acting on behalf of human agents; in music creation, they can also be software agents (tools that aid performance or notation.)

Music Activities

Activities | Description
creation process | a process to create a novel piece of work or lyrics; it results in a musical work, or in a literary work for text only.
registration process | a process to establish the identity of a new work and its rightsholders, and to give it a unique identifier, such as an ISWC for works, an ISRC for recordings, or an ISNI/VIAF for newly published authors.
music notation | results in instructions on how to play the musical work: a music sheet or a machine-readable MIDI or MusicXML file.
publishing process | makes a manifestation of the musical work available, usually against royalty payment.
👉🏾 Activities (synonymous with Processes or Events) take place in time and at a location with the active involvement of Agents and result in Things, like a musical work, new rights attached to the work, or a sound recording. As nodes they connect 🔙 Agents and 🔙 Music Entities (Things).

Metadata as Statement

“Data is only potential information, raw and unprocessed, prior to anyone actually being informed by it. […] Data must be understood not as an abstract concept but as objects that are potentially informative. […] Metadata Is a Statement about a Potentially Informative Object.” (Pomerantz 2015, p26)

  • metadata is information added to data about how we should understand, use, and process the data itself.
  • a statement is a fundamental knowledge carrier, connecting a Ⓢsubject via a 🅿️predicate to an 🅾️object.
  • a simple statement connects a person to her name: ⓈTaylor Swift 🅿️has the birth name 🅾️Taylor Alison Swift.
  • metadata statements connect protected works with persons: ⓈGold Rush🅿️has a co-author🅾️Taylor Alison Swift.
  • ⓈGold Rush (work) 🅿️was recorded by 🅾️Taylor Swift, which is equivalent to the inverse statement ⓈTaylor Swift 🅿️recorded 🅾️Gold Rush (work)
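The subject, predicate and object structure of these statements can be sketched in a few lines of Python. The `Statement` type and the `INVERSE` table are hypothetical names introduced for illustration only; in a real Wikibase the predicates are property items.

```python
from typing import NamedTuple


class Statement(NamedTuple):
    subject: str
    predicate: str
    obj: str


# Hypothetical table of inverse predicates, used only in this sketch.
INVERSE = {
    "was recorded by": "recorded",
    "recorded": "was recorded by",
}


def invert(statement):
    """Return the same fact, stated from the point of view of the object."""
    return Statement(statement.obj, INVERSE[statement.predicate], statement.subject)


birth_name = Statement("Taylor Swift", "has the birth name", "Taylor Alison Swift")
recorded_by = Statement("Gold Rush (work)", "was recorded by", "Taylor Swift")
print(invert(recorded_by))
```

Inverting a statement twice gives back the original statement, which is a simple check that the two wordings carry the same knowledge.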

Gold Rush: Statements in AI

The track spotify:3i19sqYCrCy8RK4qc8hlCg by Emma Stevens has the same title, Gold Rush, as spotify:5BK0uqwY9DNfZ630STAEaq by Taylor Swift and spotify:6x9VaGUbiSRvkLEdjeqjsN by the group Death Cab for Cutie. Resolving such ambiguities is time-consuming for human workers, and it is usually impossible with a single database.

Named Entity Recognition and Disambiguation (NERD) is an essential function in rights management and research. Deduction or inference engines based on explicit knowledge bases (which can connect various databases, too) can compare billions of statements in minutes, potentially resolving thousands of ambiguous entity statements or (legal, scientific) claims.
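To make the ambiguity concrete, here is a minimal sketch that groups the three recordings mentioned above by title and reports the titles that point to more than one identifier. It is not a NERD engine; the function name `ambiguous_titles` and the record layout are assumptions made for this example.

```python
from collections import defaultdict

# (identifier, title, artist) records taken from the example above.
tracks = [
    ("spotify:3i19sqYCrCy8RK4qc8hlCg", "Gold Rush", "Emma Stevens"),
    ("spotify:5BK0uqwY9DNfZ630STAEaq", "Gold Rush", "Taylor Swift"),
    ("spotify:6x9VaGUbiSRvkLEdjeqjsN", "Gold Rush", "Death Cab for Cutie"),
]


def ambiguous_titles(records):
    """Map each title shared by several identifiers to its candidate entities."""
    by_title = defaultdict(list)
    for identifier, title, artist in records:
        by_title[title.casefold()].append((identifier, artist))
    return {title: hits for title, hits in by_title.items() if len(hits) > 1}


for title, hits in ambiguous_titles(tracks).items():
    print(title, "->", [artist for _, artist in hits])
```

The title alone cannot tell the three recordings apart; only the identifier, or further statements about the performer, can.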

Wikibase Data Model

The Wikibase Data Model makes it easy to connect your knowledge into a graph even without software tools. It can also synchronize with proprietary, in-house databases and open knowledge graphs of the world’s great libraries, archives, universities and other knowledge institutions.

Reprexbase is based on Wikibase.

Working with Persons as Agents

Add a new person

Create a new person item ⇗
  • The person Zoltán Kodály is not the same thing as the name Zoltán Kodály. There may be many persons who use this full name.

  • To avoid confusion, we use unique identifiers for each person (later slide.)

  • Because in human language we use names, it is important to attach the name(s) of a person to the Wikibase entry of the person: for example, to add Béla and Bartók to the entry of the modern composer as a person.

  • To add a new agent to our knowledge base or a data sharing space, you need to fill out a label and a description for each Agent in at least one language (English).
  1. Log in to a Wikibase instance, for example, this Wikibase instance⇗.

  2. Choose Special pages⇗ (you can find it on the sidebar).

  3. Press Create new item⇗.

  4. Fill in the English label with the English (Western) name of the person (e.g., Taylor Swift), provide a short description of the person in the description field (e.g., American singer-songwriter), and add potential aliases, for example, the birth name (e.g., Taylor Alison Swift). Use the aliases only for name variants.
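If you prefer to script these steps, the same label, description and aliases can be sent to the Wikibase API with the `wbeditentity` action. The sketch below only builds the request parameters and sends nothing; the helper name `new_item_data` is our own, and a real request would also need a logged-in session and a CSRF `token` parameter, which are left out here.

```python
import json


def new_item_data(lang, label, description, aliases=()):
    """Build the JSON body for a new item with one language filled in."""
    data = {
        "labels": {lang: {"language": lang, "value": label}},
        "descriptions": {lang: {"language": lang, "value": description}},
    }
    if aliases:
        data["aliases"] = {
            lang: [{"language": lang, "value": alias} for alias in aliases]
        }
    return data


item = new_item_data(
    "en",
    "Taylor Swift",
    "American singer-songwriter",
    aliases=["Taylor Alison Swift"],
)

# Parameters for a POST request to the api.php endpoint of your Wikibase.
# The "token" parameter is deliberately omitted in this sketch.
params = {
    "action": "wbeditentity",
    "new": "item",
    "format": "json",
    "data": json.dumps(item, ensure_ascii=False),
}
print(params["data"])
```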

Define the new person

Define the new person with a classification and if possible, with an equivalent Wikidata QID.

Define the new person with a QID. Distinguish Zoltán Kodály↗ as a person from Zoltán Kodály as a name.
  1. Scroll to the Statements part below the definition box of the person, for example, Wikidata URI (item) (P73)⇗.

  2. Press +add statement.

  3. The subject of the statement is the person whose page you are editing (for example, Zoltán Kodály (Q296)⇗).

  4. Scroll down and choose a property, or enter the property number directly. The predicate about the person is always expressed via a property, for example, Wikidata URI (item) (P73)⇗, instance of (P2) (https://reprexbase.eu/demowiki/index.php?title=Property:P2), or given name (P71)⇗. The actual links and the P numbers are different in each Wikibase instance.

  5. If the person has a Wikidata page, the first statement that you enter should be the Wikidata URI (item) (P73)⇗, for Zoltán Kodály this is https://www.wikidata.org/wiki/Q153008.

  1. The second statement should be a class; you should be as specific as you can be with confidence. Is Zoltán Kodály a human↗? A living or deceased person↗? A music professional⇗? More specifically a composer↗? Is composer too specific? In this case, it is too specific, because Zoltán Kodály performed many music-related activities.

  2. Our Reprexbase instances use the same classification as Wikidata when we believe that it is useful. For example, Wikidata uses dead human↗, but our preferred label is deceased person. We also use the term music professional↗ for people who work professionally with music, but not necessarily in an artistic role (e.g., tour managers and sound engineers qualify, too.)

  3. You can add several instance of statements. For example, if your Wikibase has already defined these classes, you can call Zoltán Kodály both a composer↗ and a music educator↗, but not an author name string↗. (Our music educator definition is more refined than Wikidata’s, but they can be treated as equal for data linking.)
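The two statements described above can also be written in the JSON form that Wikibase uses for claims. This is a sketch under two assumptions: that Wikidata URI (item) (P73) has a URL datatype, and that `Q12345` stands in for the class item of your choice (it is a placeholder, not a real item of the demo instance). The helper names are ours.

```python
def url_claim(prop, url):
    """A statement whose value is a URL (stored as a string datavalue)."""
    return {
        "mainsnak": {
            "snaktype": "value",
            "property": prop,
            "datavalue": {"value": url, "type": "string"},
        },
        "type": "statement",
        "rank": "normal",
    }


def item_claim(prop, qid):
    """A statement whose value is another item, e.g. a class for instance of."""
    return {
        "mainsnak": {
            "snaktype": "value",
            "property": prop,
            "datavalue": {
                "value": {
                    "entity-type": "item",
                    "numeric-id": int(qid[1:]),
                    "id": qid,
                },
                "type": "wikibase-entityid",
            },
        },
        "type": "statement",
        "rank": "normal",
    }


CLASS_QID = "Q12345"  # placeholder: look up the class item in your own Wikibase

claims = [
    url_claim("P73", "https://www.wikidata.org/wiki/Q153008"),  # Wikidata URI first
    item_claim("P2", CLASS_QID),                                # then instance of
]
```

The list can be passed as the `claims` part of the `data` parameter of a `wbeditentity` request for the item you are editing.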

Add Name

Add author name string wd:P2093↗ or on ReprexBase demo instance P230⇗
  1. Add the name as a string, for example, for authors, using the author name string (P2093)↗ property. The author name string property is used to mass-import names from library catalogues.

  2. The problem with name strings is that they may have several (spelling) versions, different name orders in various languages, and they may have spelling errors.

  3. Whenever possible, we use controlled names. This means that after typing Zoltán Kodály, you should try to add Zoltán with the given name (P735)↗ property (see next steps.)

  1. Add the name as a string with the author name string (P2093)↗ (P230)⇗ property.

  2. Type the name, for example, Zoltán Kodály. If the main language of the Wikibase is English, use the English (Western) name order.

  3. You can skip this step if you find both the given name and the family name in the instance as a controlled entity (item). (See next slide.)

Manage Given Name(s)

Try to add controlled given name(s)

If you do not find the controlled name in the knowledge base, enter it as a string.
  1. Try to use the given name (P735)↗ (P71)⇗ property and find the name as an entity (item) in your Wikibase. If the given name is present, it contains knowledge about archaic spellings, spelling variations, and other information about its correct use. A given name is more useful than the author name string, because it makes it clear that Zoltán is a given name (in the Hungarian language, it can be a family name, too.)

  2. If you do not find the given name as a pre-defined entity (item with a Q number), you can add it as a given name string. A given name string is still more useful than the author name string, because it makes it clear that Katarína is part of the given name(s) of the artist. In our example, we added Katarína with Slovak spelling to the entry of Katarína Kubošiová (Katarzia)↗ - Katarína Kubošiová (Katarzia)⇗. You can add several given names, if necessary.

Manage Family Name(s)

Whenever you can, you should work with controlled family names.

If you do not find the controlled family name in the knowledge base, enter it as a family name string.
  1. Try to use the family name (P734)↗ property and find the name as an entity (item) in your Wikibase. If the family name is present in this form, it may contain knowledge about archaic spellings, spelling variations, and other information about its correct use. A family name is more useful than the author name string, because it makes it clear that Kodály↗ is a family name.

  2. If you do not find the family name as a pre-defined entity (item with a Q number), you can add it as a family name string. A family name string is still more useful than the author name string, because it makes it clear that Kodály is part of the family name(s) of the artist. In our example, we added Kodály as a string and Kodály↗ as an item. It is not a problem if you add both. (It is likely that the Kodály string is added first and then it is matched with Kodály↗.)

  3. Beware that Kodály↗ is not a family: it is a family name. Many unrelated people may use the same family name (or surname). Family relations must be entered in a different way (not part of this tutorial.)
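The rule for given and family names (use the controlled item when it exists, otherwise fall back to a name string) can be sketched as a small lookup. The `controlled_items` index and the QID `Q2001` are hypothetical placeholders for whatever name items your Wikibase already contains.

```python
def name_statement(kind, value, controlled_items):
    """Choose between a controlled name item and a plain name string.

    kind is "given name" or "family name"; controlled_items maps
    (kind, name) pairs to the QID of an existing name item.
    """
    qid = controlled_items.get((kind, value))
    if qid is not None:
        return {"property": kind, "datatype": "item", "value": qid}
    return {"property": f"{kind} string", "datatype": "string", "value": value}


# Placeholder index: only the family name Kodály exists as an item here.
controlled_items = {("family name", "Kodály"): "Q2001"}

print(name_statement("family name", "Kodály", controlled_items))
print(name_statement("given name", "Katarína", controlled_items))
```

A string entered this way can later be matched to a controlled item once that item has been created, so nothing is lost by starting with the string.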

Identifiers

Many people may have the same name, but a given authority file identifier or persistent identifier (PID) should belong to only one of them.

  • We must use globally unique identifiers to connect data, information or knowledge about the same persons in different databases.

  • Our preferred identifiers are VIAF, to connect library catalogues; ISNI, for an ISO standard name identification; and the Wikidata QID, because it allows us to connect to many more identifiers and data sources.

  • Tip: you can add multiple identifiers by opening several +add statement fields.

  1. Under the definition of the person select +add statement.

  2. Select the VIAF ID (P214)↗ - VIAF ID (P13)⇗ property and/or the ISNI (P213)↗ - ISNI (P226)⇗ property. The equivalent properties have different numbers in each Wikibase instance.

  1. Add the URIs in canonical form to the statement, for example, https://viaf.org/viaf/89006617/ or https://isni.org/isni/0000000121429277.

  2. You have probably done this earlier, because the Wikidata URI (item) (P73)⇗ is a similar global identifier. Whenever possible, we add it as a first statement. (It has no Wikidata equivalent↗; it helps to connect us to the Wikidata equivalent.)

  3. Examples: Taylor Swift (Q58)⇗, Taylor Swift (Q26876)↗, Béla Bartók (Q294)⇗, Béla Bartók (Q83326)↗. The equivalent entity items have different numbers in each Wikibase instance.
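Before pasting an identifier, it can help to check it and to build the canonical URI mechanically. The sketch below validates the ISNI check character (ISO/IEC 7064 MOD 11-2) and formats the two URI patterns shown above. The function names are our own, and the check only detects typing errors; it does not prove that the identifier belongs to the right person.

```python
def isni_check_character(first15):
    """Compute the 16th ISNI character from the first 15 digits (MOD 11-2)."""
    total = 0
    for digit in first15:
        total = (total + int(digit)) * 2
    result = (12 - total % 11) % 11
    return "X" if result == 10 else str(result)


def is_valid_isni(isni):
    """Accept ISNIs written with or without spaces between the digit groups."""
    compact = isni.replace(" ", "").upper()
    head = compact[:15]
    if len(compact) != 16 or not (head.isascii() and head.isdigit()):
        return False
    return isni_check_character(head) == compact[15]


def isni_uri(isni):
    """Canonical ISNI URI, following the pattern used in this tutorial."""
    return "https://isni.org/isni/" + isni.replace(" ", "").upper()


def viaf_uri(viaf_id):
    """Canonical VIAF URI, following the pattern used in this tutorial."""
    return f"https://viaf.org/viaf/{viaf_id}/"


print(is_valid_isni("0000 0001 2142 9277"), isni_uri("0000 0001 2142 9277"))
print(viaf_uri("89006617"))
```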

Enter new language labels

Add new language via Set Item/Property label, description and aliases.

Add the label and the description in the new language. You can add new aliases, too.

Now you have language-specific labels for Zoltán Kodály (the person).
  1. On the sidebar or on the Special pages, press Set Item/Property label, description and aliases ⇗.

  2. Add the QID of the item where you want to enter a new language. For Zoltán Kodály, this is Q296. You can see it from the URL of the item: https://reprexbase.eu/demowiki/index.php?title=Item:Q296 (ending with Item:Q296.)

  3. Select the two-letter language code of the next language.

  4. Add the label and the description in the new language. You may add language-specific aliases.
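The same steps can be sketched for the API: read the QID from the item URL, then prepare a `wbsetlabel` request for the new language. Only the parameters are built here; the helper names are ours, the Hungarian label is an example value, and a real request also needs a CSRF `token`.

```python
import re


def qid_from_item_url(url):
    """Return the QID at the end of an item URL, or None if there is none."""
    match = re.search(r"Item:(Q\d+)$", url)
    return match.group(1) if match else None


def set_label_params(qid, language, label):
    """Parameters for the wbsetlabel action (token omitted in this sketch)."""
    return {
        "action": "wbsetlabel",
        "id": qid,
        "language": language,
        "value": label,
        "format": "json",
    }


qid = qid_from_item_url("https://reprexbase.eu/demowiki/index.php?title=Item:Q296")
params = set_label_params(qid, "hu", "Kodály Zoltán")  # Hungarian name order
print(params)
```

Descriptions and aliases have their own actions (`wbsetdescription`, `wbsetaliases`) that take similar parameters.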

Birth name

Add the birth name (you must define the language of the birth name, too).
  • The full name of a person at birth can be different from their current, generally used name.

  • Birth names often change when people get married and start a new family under a new family name.

  • Artists often choose a variation of their name that identifies them better. There may be many reasons why a person or agent has a different name at birth (or creation) than later.

  • The birth name is a name string in the spelling and name order of the (original) language.

  1. Under the definition of the person select +add statement.

  2. Select the birth name (Wikidata)↗ property; in our Wikibase, this is birth name (P299)⇗.

  1. Type in the birth name as a string, and select the language.

  2. Examples: Taylor Swift (Q58)⇗, Taylor Swift (Q26876)↗ in English and Zoltán Kodály (Q296)⇗ with a Hungarian birth name.
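A birth name is a string with a language attached, which Wikibase stores as a monolingual text value. The sketch below shows what such a statement might look like in claim JSON, assuming that birth name (P299) uses the monolingual text datatype in the demo instance; the helper name is our own.

```python
def monolingual_claim(prop, text, language):
    """A statement whose value is a text in one named language."""
    return {
        "mainsnak": {
            "snaktype": "value",
            "property": prop,
            "datavalue": {
                "value": {"text": text, "language": language},
                "type": "monolingualtext",
            },
        },
        "type": "statement",
        "rank": "normal",
    }


birth_name = monolingual_claim("P299", "Taylor Alison Swift", "en")
print(birth_name["mainsnak"]["datavalue"]["value"])
```

Because the language travels with the text, the same property can hold a Hungarian birth name on one item and an English one on another without ambiguity.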