Can Metadata Predict Your Personality?
What Is Metadata?
Metadata is descriptive information that helps identify or describe data. It is much like the cover of a book: it gives the details needed to discern the contents (the data) of the book, along with information that can be used to identify or categorize the book within a library. Metadata follows a similar logic; it is the information used to describe a digital asset.
Metadata is crucial to managing large amounts of digital information. Terms or data points are associated with a digital file to describe it. These could include details such as the date the file was created, the file name, the type of file, the subject matter, or the author. Certain types of digital assets also include GPS location information. Metadata helps organize your digital library and makes it possible to search for specific assets within a group of objects. In other words, metadata is the breadcrumb trail that leads you to the desired information.
Three main categories of metadata are used most often. These categories help define the type of information that is being searched for.
- Descriptive Metadata – This is where you will find terms that describe the digital asset. This can include keywords, file name, or author. These are the terms generally used to search for a specific file.
- Structural Metadata – This is additional information that will help you recognize more files associated with the asset. It can reveal if it is an individual file or part of a collection.
- Administrative Metadata – This includes necessary details such as the type of file or the date it was created.
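The three categories can be pictured as fields on a single record. The following is only a minimal sketch: the class name `AssetMetadata`, its fields, and the sample library are hypothetical and do not follow any particular metadata standard.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional


@dataclass
class AssetMetadata:
    # Descriptive metadata: terms people search on
    file_name: str
    author: str
    keywords: List[str] = field(default_factory=list)
    # Structural metadata: how the asset relates to other files
    collection: Optional[str] = None
    # Administrative metadata: technical and housekeeping details
    file_type: str = "unknown"
    created: Optional[date] = None


def search(library, keyword):
    """Return the assets whose descriptive keywords include the term."""
    term = keyword.lower()
    return [a for a in library if term in (k.lower() for k in a.keywords)]


# Hypothetical sample library
library = [
    AssetMetadata("beach.jpg", "Ana", ["vacation", "beach"],
                  "Summer 2021", "image/jpeg", date(2021, 7, 4)),
    AssetMetadata("report.pdf", "Ben", ["budget", "quarterly"],
                  None, "application/pdf", date(2022, 1, 15)),
    AssetMetadata("dunes.jpg", "Ana", ["vacation", "desert"],
                  "Summer 2021", "image/jpeg", date(2021, 7, 9)),
]

matches = search(library, "Vacation")
print([a.file_name for a in matches])  # ['beach.jpg', 'dunes.jpg']
```

Here the search relies only on descriptive metadata, while the `collection` field is what would reveal that the two images belong together.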
Data Hidden in Metadata
Metadata is the information behind the scenes. It is used in every type of industry. With today’s technology, nearly everything we do in life can leave a metadata trail. Your Fitbit can record how well you slept last night. Smart appliances can tell when you made your morning coffee or washed your last load of laundry. That favorite stop for lunch produces metadata when you use your credit card. The weekly trip to the gas station leaves a trail of data too. All your online activities, text messages, images taken and sent, and even the documents you produce include metadata. Every day, you leave a trail behind you. It isn’t about any one piece of data but about the accumulated mass of metadata that could be put together to map out your very existence.
Making Predictions with Metadata
What can be done with vast amounts of metadata? One area of analytics that is growing with the use of metadata is prediction. Sophisticated applications using artificial intelligence and machine learning can categorize metadata and infer its connections to other pieces of data. When data is collected over time, the application can make predictions, as predictive text technology does. Google uses this concept to show you relevant listings and ads. Predicting what the user wants is a common theme in today’s technology.
When you use an application such as Netflix, you leave behind a trail of movies. Netflix keeps track of the movies you watch. To suggest movies, the application needs to know more about each one than its title and the fact that it is a video file. The metadata includes information about the type of film: action, sci-fi, drama, or documentary. If you repeatedly watch thrillers or action movies, it is easy to predict that you might be interested when a new action film is added. In the same way, the metadata held in music files is how companies like Spotify predict new music they think you will enjoy.
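The idea of predicting from genre metadata can be sketched in a few lines. This is a deliberately simple illustration, not how Netflix or Spotify actually work; the function names and the sample viewing history are hypothetical.

```python
from collections import Counter


def favorite_genres(history, top_n=2):
    """Return the genres that appear most often in a viewing history."""
    counts = Counter(item["genre"] for item in history)
    return [genre for genre, _ in counts.most_common(top_n)]


def suggest(history, new_titles, top_n=2):
    """Suggest new titles whose genre is among the viewer's favorites."""
    liked = set(favorite_genres(history, top_n))
    return [t["title"] for t in new_titles if t["genre"] in liked]


# Hypothetical viewing history, reduced to title and genre metadata
history = [
    {"title": "Film A", "genre": "action"},
    {"title": "Film B", "genre": "thriller"},
    {"title": "Film C", "genre": "action"},
    {"title": "Film D", "genre": "documentary"},
    {"title": "Film E", "genre": "thriller"},
    {"title": "Film F", "genre": "action"},
]
new_titles = [
    {"title": "New Action Film", "genre": "action"},
    {"title": "New Drama", "genre": "drama"},
]

print(favorite_genres(history))      # ['action', 'thriller']
print(suggest(history, new_titles))  # ['New Action Film']
```

Nothing in this sketch looks at the films themselves; the suggestion comes entirely from the genre labels attached to them.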
The collection of metadata by government agencies has been ongoing for some time. The NSA, FBI, DEA, and others use metadata to solve crimes and fight terrorism. Given the predictive nature of metadata and the extensive trail that we leave behind about our lives, the NSA could plausibly use your metadata trail to predict your personality.
MIT Media Lab researchers have found that metadata can be used to predict personality patterns. The metadata can describe how you use your phone, how you prefer to communicate, with whom, and from where. The accumulated patterns in that data can be a predictor or indicator of your personality.
The researchers used answers to a questionnaire from 100 students to assign each student one of five personality types. These five types are comparable to the traits of the Five-Factor Model of Personality, which is widely used in psychological studies.
- Neurotic: Highly emotional, with a higher-than-normal tendency to experience unpleasant emotions.
- Open: Inquisitive, broadly curious, and creative.
- Extroverted: One who looks toward others for stimulation and acceptance.
- Agreeable: Friendly, warm, compassionate, and cooperative.
- Conscientious: No-nonsense, self-disciplined, organized, and hard-working.
The researchers then studied the metadata from the students’ phone records for 18 months. They were interested in specific sets of metadata:
- Primary phone use – which would include details such as the number of calls, time of day, and length of the call.
- Active user behaviors – this was measured by the number of calls initiated by the user and by how attentive they were, based on the time it took the subject to answer a text.
- Location or movement – this covers the number of places from which calls were made, along with other indicators such as the so-called ‘radius of gyration.’
- Regularity of calling routine – whether the user called at certain times of the day or had regular contact with particular individuals.
- Diversity – this was calculated as the ratio “between the subject’s total number of contacts and the relative frequency at which he or she interacts with them.”
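To make these indicators more concrete, here is a rough sketch of how a few of them might be computed from call records. The record layout and the formulas are assumptions made for illustration, not the researchers' definitions: the radius of gyration is taken as the root-mean-square distance of call locations from their centroid, and diversity as the Shannon entropy of contact frequencies divided by the log of the number of contacts.

```python
import math
from collections import Counter

# Hypothetical call records:
# (contact, initiated_by_user, duration_seconds, (x_km, y_km))
calls = [
    ("alice", True, 120, (0.0, 0.0)),
    ("alice", False, 60, (0.0, 0.0)),
    ("bob", True, 300, (3.0, 4.0)),
    ("carol", True, 30, (3.0, 4.0)),
]


def basic_use(calls):
    """Number of calls, mean call length, and share of calls the user started."""
    n = len(calls)
    return {
        "calls": n,
        "mean_duration": sum(c[2] for c in calls) / n,
        "share_initiated": sum(1 for c in calls if c[1]) / n,
    }


def radius_of_gyration(calls):
    """Assumed definition: RMS distance of call locations from their centroid."""
    points = [c[3] for c in calls]
    n = len(points)
    cx = sum(p[0] for p in points) / n
    cy = sum(p[1] for p in points) / n
    mean_sq = sum((p[0] - cx) ** 2 + (p[1] - cy) ** 2 for p in points) / n
    return math.sqrt(mean_sq)


def diversity(calls):
    """Assumed definition: entropy of contact frequencies / log(number of contacts)."""
    counts = Counter(c[0] for c in calls)
    k = len(counts)
    if k <= 1:
        return 0.0  # a single contact carries no diversity
    total = sum(counts.values())
    entropy = -sum((v / total) * math.log(v / total) for v in counts.values())
    return entropy / math.log(k)


print(basic_use(calls))
print(radius_of_gyration(calls))   # 2.5 (km)
print(round(diversity(calls), 3))  # 0.946
```

A diversity value close to 1 means calls are spread evenly across contacts; a value close to 0 means most calls go to the same person.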
Using machine learning, they were able to associate a personality type with certain behaviors. One of the researchers, de Montjoye, explained how they came to their conclusions: “We let the algorithm determine the right mix. Each indicator is useful but is conditional on all the other indicators. That doesn’t mean each one is causal or that people who travel more are neurotic. Let’s say that the relationships between A and B are not linear; if you do a linear progression, you see no relationship; you do a quadratic progression, you do see how A can predict B.”
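The point about A and B can be shown with a toy example. What the quote calls a linear and a quadratic "progression" presumably corresponds to linear and quadratic regression; the sketch below assumes so. The data are made up, and the quadratic fit is simplified to a straight-line fit of B on A squared.

```python
def r_squared(xs, ys):
    """R^2 of a least-squares straight-line fit of ys on xs (with intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return (sxy * sxy) / (sxx * syy)


# Made-up data: B depends entirely on A, but not in a linear way
a = [-3, -2, -1, 0, 1, 2, 3]
b = [x * x for x in a]

linear_fit = r_squared(a, b)                       # fit B on A
quadratic_fit = r_squared([x * x for x in a], b)   # fit B on A squared

print(linear_fit)     # 0.0 -> a straight line sees no relationship
print(quadratic_fit)  # 1.0 -> the squared term predicts B exactly
```

An R² of 0 for the straight line and 1 for the squared term is the extreme case of what the quote describes: the relationship exists, but only a model with the right shape can see it.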
De Montjoye also made an eye-opening point about the public’s perception of metadata: “We see a lot of comments along the lines of ‘It’s only metadata. It’s not personal. And it only gets personal when humans look into it.’ We wanted to show an example at a small scale of what you might be able to do” with data on how long calls last, when they are made, and where.
Who Can See Metadata?
It may be easy to dismiss the NSA’s collection of metadata by saying, “What do I have to hide?” The truth is everything. There is little legislation in the United States that specifically governs the use of metadata. You may think it’s okay for the government to collect it; after all, it is for national security. However, we are misled by the US government’s earlier position that “metadata is not data,” a position associated with section 702 of FISA. Clearly, there is a great deal of information in metadata.
It’s essential to be concerned about the type of metadata you produce because anyone can view it. Viewing it is easy enough that your 12-year-old neighbor could do it. If you can check an image file that belongs to you for its EXIF data (a type of image metadata), then it is only reasonable to assume that once you post that photo or release it in any way, anyone else can view the metadata as well.
It may be a good idea to review your safe computing practices and remove metadata from your files. In many of the applications you use, you can check the settings to reduce or remove metadata, or to set limits on what is collected and added. Keep this in mind whenever you post images, whether at work, at home, or from your phone. The same applies to all types of files, including documents and videos. Most software packages used to create a file of any kind have built-in ways of adding metadata to it. You wouldn’t want to give out your home address randomly across the internet, but the photos on your Facebook page may be doing it for you.