While surfing the web, have you ever come across an advertisement that seemed surprisingly relevant to you? And on video websites like YouTube or Netflix, have you noticed the section that shows videos recommended for you? If so, you have probably wondered about the algorithms behind them. Did you know that this technology relies on a sophisticated level of Artificial Intelligence? In this article, you will learn the basic principles of how AI is used on the Internet.
Definition of Big Data and its Use
For AI to target information effectively, it needs data. Lots of it. Thankfully, we live in a society where technology makes that data easy to collect. The term for this technology is “Big Data.” So, what exactly does Big Data mean, and how does it work? Put simply, Big Data refers to collections of data sets so gigantic that they can reach petabytes in size. Can you guess the size of a petabyte? It is a million gigabytes. Considering that a single-layer Blu-ray disc holds about 25 gigabytes, a petabyte could fill roughly 40,000 of them, which is an enormous amount of data by any standard. By applying inductive statistics to data on this scale, analysts can predict how people will behave in certain situations. Now, let’s take a look at some examples of how Big Data is used.
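To make the scale concrete, the arithmetic can be sketched in a few lines of Python. This is only an illustration: it assumes decimal units (1 petabyte = 1,000,000 gigabytes), and the file sizes are example values chosen for the sketch.

```python
# Decimal (SI) units: one petabyte is one million gigabytes.
GB_PER_PB = 1_000_000


def items_per_petabyte(item_size_gb: float) -> float:
    """Return how many items of the given size (in GB) fit in one petabyte."""
    return GB_PER_PB / item_size_gb


# Illustrative sizes (assumptions): a 4 GB video file and a 25 GB disc.
for size_gb in (4, 25):
    count = items_per_petabyte(size_gb)
    print(f"{count:,.0f} items of {size_gb} GB fit in one petabyte")
```

Even with the larger item size, tens of thousands of items fit in a single petabyte.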
Web Advertising through Big Data
Let’s begin with web advertisements that use Big Data. As mentioned earlier, relevant advertisements are quite common these days. Why? The answer is efficiency. Through customized advertising techniques, advertisers can boost sales of their products while cutting costs. The details of the mechanism vary from one advertising company to another, but the main idea is the same: analyzing customers’ behavior patterns in order to show them the products they are most likely to be interested in. The analysis itself is quite complicated, but it can be simplified into two general steps. First, analysts study users’ activity on certain websites and work out their preferences. Then, once they understand the users’ patterns and preferences, they show those users customized advertisements.
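The two steps above can be sketched as a toy Python example. Real advertising systems are far more elaborate and differ from company to company; the function names, page categories, and ad titles here are all made up for illustration.

```python
from collections import Counter


def infer_preferences(page_views, top_n=2):
    """Step 1: rank content categories by how often the user viewed them."""
    counts = Counter(view["category"] for view in page_views)
    return [category for category, _ in counts.most_common(top_n)]


def choose_ads(preferences, ad_inventory):
    """Step 2: keep only the ads whose category matches a preference."""
    return [ad["title"] for ad in ad_inventory if ad["category"] in preferences]


# Hypothetical browsing log for one user.
page_views = [
    {"url": "/reviews/running-shoes", "category": "sports"},
    {"url": "/recipes/kimchi-stew", "category": "food"},
    {"url": "/news/marathon-results", "category": "sports"},
    {"url": "/reviews/laptops", "category": "tech"},
    {"url": "/recipes/bibimbap", "category": "food"},
    {"url": "/news/football", "category": "sports"},
]

# Hypothetical ads available to show.
ad_inventory = [
    {"title": "Trail running shoes", "category": "sports"},
    {"title": "Budget airline tickets", "category": "travel"},
    {"title": "Meal kit delivery", "category": "food"},
]

preferences = infer_preferences(page_views)
print(preferences)
print(choose_ads(preferences, ad_inventory))
```

Counting page categories is the simplest possible stand-in for "figuring out preferences"; it is used here only because it keeps the two steps easy to see.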
Here is a familiar example of Big Data marketing: Facebook. Marketers who want to advertise on Facebook can choose their targets very specifically. They can select their target audience by location, age, gender and interests. The content that a target user and their friends have “liked” also feeds into the ad mechanism. An interesting part of Facebook’s Big Data system is that not only advertisers but also users can choose which ads they want to see or avoid. For instance, a user who is interested in food ads can choose to see them in the advertisement settings. A user who is not can simply remove them from the settings to stop receiving them.
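The kind of targeting described above can be sketched as a simple filter. This is not Facebook’s actual system or API; the `User` and `Ad` classes, their fields, and the `should_show` function are hypothetical names invented for this example.

```python
from dataclasses import dataclass, field


@dataclass
class User:
    location: str
    age: int
    gender: str
    interests: set = field(default_factory=set)
    hidden_topics: set = field(default_factory=set)  # ad topics the user opted out of


@dataclass
class Ad:
    topic: str
    locations: set   # advertiser's chosen target locations
    min_age: int
    max_age: int
    genders: set
    interests: set   # the ad is shown to users sharing at least one of these


def should_show(ad: Ad, user: User) -> bool:
    """Show the ad only if the user has not hidden the topic and fits the target."""
    if ad.topic in user.hidden_topics:
        return False  # the user's own setting overrides the advertiser's target
    return (
        user.location in ad.locations
        and ad.min_age <= user.age <= ad.max_age
        and user.gender in ad.genders
        and bool(ad.interests & user.interests)
    )


food_ad = Ad("food", {"Seoul"}, 18, 35, {"female", "male"}, {"cooking", "restaurants"})
fan = User("Seoul", 22, "female", interests={"cooking", "travel"})
opted_out = User("Seoul", 22, "female", interests={"cooking"}, hidden_topics={"food"})

print(should_show(food_ad, fan))        # matches the target and has not opted out
print(should_show(food_ad, opted_out))  # matches the target but hid food ads
```

Checking the user’s hidden topics first reflects the idea that a user’s own ad settings take priority over the advertiser’s targeting.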
Big Data Played a Crucial Role in Obama’s Victories
Digital marketing is not limited to commercial uses. It has also been used in politics. In the United States presidential elections of 2008 and 2012, former US President Barack Obama used Big Data to help win both races. By analyzing voters’ wealth, race, households, educational backgrounds and political orientations, his campaign was able to run several targeted strategies effectively. Of all the strategies used during the campaign, the one that stands out most is the customized email campaign. For moderates, the campaign studied how persuadable they were and sent personalized emails to those whose minds it thought could be changed. People who were interested in dogs received emails about Obama’s own dog. People who cared about environmental issues received emails about his environmental policies. Meanwhile, to solidify support and unity among his followers, the campaign sent them emails recommending that they join local community clubs made up of people with similar perspectives. Examples of these clubs include “Asians who Support Obama” and “Parents who are Concerned about Childcare.”
Big Data on YouTube and Netflix
Video websites are also among the places where Big Data is used most. While watching a video on YouTube, you will see an “Up next” section on the right side of the page that guides you to relevant videos. How does it work? There are two steps. The first step is called “candidate generation.” In this stage, the AI selects videos that were watched by users whose tastes are similar to the viewer’s. Viewers’ tastes are studied in a couple of ways, such as explicit feedback*, which means giving a video a thumbs up or down, and subscription records. The next step is called “ranking.” This phase puts the videos selected in the first step in order. How does it rank the videos? The scoring algorithm in the ranking step depends on the expected watch time*. This expected time is calculated from the number of videos the user has watched on the same channel, the keywords the user typed to find the video, and how long the user has watched videos in the same genre. The ways YouTube calculates watch time are so intricate that they can even give us goosebumps. YouTube uses such a complex two-step recommendation system because the scale of the site is so huge. Since thousands of videos are uploaded in a short time, YouTube might recommend completely irrelevant videos without its elaborate Big Data system.
Netflix has quite a similar recommendation mechanism. Using watch history* and its rating system, it decides which films to recommend. It even shows users a “match percentage,” which indicates how likely they are to enjoy a film. The match score depends on the rating system, so the more videos a user rates, the more accurate the match will be.
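The two-step process described for YouTube can be sketched in Python. This is a toy illustration of candidate generation followed by ranking, not YouTube’s actual algorithm: the overlap-based similarity measure, the threshold, the sample users, and the watch-time figures are all assumptions made for this example.

```python
def jaccard(a, b):
    """Similarity of two sets of liked videos: shared items over all items."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def generate_candidates(viewer, likes, watched, min_similarity=0.3):
    """Step 1: collect unseen videos watched by users with similar likes."""
    candidates = set()
    for other, other_likes in likes.items():
        if other == viewer:
            continue
        if jaccard(likes[viewer], other_likes) >= min_similarity:
            candidates |= watched[other] - watched[viewer]
    return candidates


def rank_by_expected_watch_time(candidates, video_genre, avg_minutes_by_genre):
    """Step 2: order candidates by an estimate of expected watch time.

    The estimate here is simply the viewer's average minutes watched per
    video in the same genre (an assumed, simplified stand-in).
    """
    def expected_minutes(video):
        return avg_minutes_by_genre.get(video_genre[video], 0.0)

    # Longest expected watch time first; video id breaks ties.
    return sorted(candidates, key=lambda v: (-expected_minutes(v), v))


# Hypothetical data: thumbs-up records and watch histories.
likes = {"viewer": {"v1", "v2", "v3"}, "ana": {"v1", "v2", "v4"}, "ben": {"v9"}}
watched = {
    "viewer": {"v1", "v2", "v3"},
    "ana": {"v1", "v2", "v4", "v5", "v6"},
    "ben": {"v9", "v10"},
}
video_genre = {"v4": "cooking", "v5": "gaming", "v6": "music", "v10": "news"}
avg_minutes_by_genre = {"gaming": 12.0, "music": 8.0, "cooking": 4.5}

candidates = generate_candidates("viewer", likes, watched)
print(rank_by_expected_watch_time(candidates, video_genre, avg_minutes_by_genre))
```

Splitting the work this way means the expensive scoring only runs on a short list of candidates, which is the reason a site with a huge catalogue would want two steps in the first place.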
Drawbacks of Big Data
After reading about the merits of Big Data, such as efficiency, you may think that it has only a good side. Unfortunately, that is not the case. Big Data also has a dark side, and it can act as a double-edged sword. The biggest disadvantage of Big Data is that our information may spread anywhere. As we use the Internet, we spill a huge amount of personal data across various websites. Granted, even if we hand over our personal data freely, our valuable information may remain safe as long as it is properly controlled and used only for good purposes. But what if it falls into the wrong hands, such as those of hackers or criminals? What if they steal our information and threaten us with it? Another critical problem is that Big Data can be used to manipulate people. According to psychological theories, people tend to feel more positively about things they see more often. Manipulated by customized content, people may come to favor whatever they are repeatedly shown without even realizing it. As you can imagine, the issues concerning Big Data are big.
Big Data — a Double-edged Sword
Despite the dangers that Big Data may bring, its future is bright. Analysts say that demand in the Big Data technology and services market will keep increasing. The market is expected to reach $60.9 billion by 2020, up from about $43 billion in 2017. Although the fruit of Big Data may be sweet right now, we will have to keep a balanced view of it to prevent that fruit from rotting.
*explicit feedback — feedback that is readily observable
*expected watch time — the time that the user is probably going to watch that video for
*watch history — the records of one’s watched videos
Kim Junhwan DogeDoggo@naver.com
<Copyright © Hongik University English Newspaper. Unauthorized reproduction and redistribution prohibited.>