Rabbit Hole #1

AI and Big Data: How to use machine learning to know your audience?


We live in a world full of data. As the DIKW pyramid shows, data is the foundation for information, knowledge, and, ultimately, wisdom: by analyzing and enriching large sets of data, we add value at each level. As technology develops, extracting trends and insights from unorganized raw data is becoming easier for individuals and organizations. Today, most for-profit companies collect data and use analytical tools to target customers, reduce costs, and make better decisions. Nonprofit arts organizations also rely on databases to collect audience information and understand their audiences better, and people are still exploring ways to apply new technologies to analyze that data more precisely. AI, one of the hottest emerging concepts at the moment, can do more than simulate human behavior. In this article, we will explore how arts organizations can integrate machine learning, a type of artificial intelligence with strong analytical and forecasting capabilities, to better “learn” from their audience data and achieve their missions.

DIKW Pyramid, Source: Ontotext

What is machine learning?

Machine learning is a branch of artificial intelligence closely tied to data analysis. Its core idea is to let machines learn from data and make predictions; it is a way to “use computers to answer questions”. The computer learns from a training dataset, which is historical data we already have, finds patterns and builds a model, and is then evaluated on a separate test dataset to check how well the model fits data it has not seen. For example, suppose we have a dataset of three species of iris with the sepal length and sepal width of each flower. We can use this dataset to let the computer learn a model of the relationship between sepal features and species. When we find a new flower, the computer can then help us identify its species from its sepal length and width. With so few features we might spot the pattern and classify the species ourselves, but when data has many features and no obvious pattern, machine learning becomes helpful.
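To make the train-then-test idea concrete, here is a minimal sketch in Python. It is only an illustration: the measurements are made-up values rather than the real Iris dataset, and the nearest-centroid method (averaging each species and picking the closest average) is just one simple technique chosen for readability. The names `fit_centroids`, `predict`, and `accuracy` are hypothetical.

```python
from math import dist

# Illustrative (sepal_length_cm, sepal_width_cm) values, NOT real measurements.
TRAIN = [
    ((5.0, 3.4), "setosa"),
    ((4.8, 3.1), "setosa"),
    ((5.2, 3.6), "setosa"),
    ((6.0, 2.8), "versicolor"),
    ((5.8, 2.7), "versicolor"),
    ((6.2, 2.9), "versicolor"),
    ((6.9, 3.1), "virginica"),
    ((7.1, 3.0), "virginica"),
    ((6.8, 3.2), "virginica"),
]
# Held-out flowers the model never sees during training.
TEST = [
    ((5.1, 3.5), "setosa"),
    ((5.9, 2.8), "versicolor"),
    ((7.0, 3.1), "virginica"),
]

def fit_centroids(samples):
    """Learn one average point (centroid) per species from the training data."""
    sums = {}
    for (length, width), species in samples:
        total = sums.setdefault(species, [0.0, 0.0, 0])
        total[0] += length
        total[1] += width
        total[2] += 1
    return {s: (t[0] / t[2], t[1] / t[2]) for s, t in sums.items()}

def predict(centroids, features):
    """Assign the species whose centroid is closest to the new flower."""
    return min(centroids, key=lambda s: dist(centroids[s], features))

def accuracy(centroids, samples):
    """Share of samples whose predicted species matches the known label."""
    hits = sum(predict(centroids, f) == label for f, label in samples)
    return hits / len(samples)

model = fit_centroids(TRAIN)
print(predict(model, (5.0, 3.3)))   # classify a new flower
print(accuracy(model, TEST))        # evaluate on the test set
```

Real iris measurements overlap more than these toy values do, so a real model would rarely classify every test flower correctly.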

Three Iris species in Iris Dataset, Source: medium

Machine learning is commonly used for image and video recognition, prediction, and personalized push notifications on social media platforms. Companies often apply it to build pricing strategies by mining historical price data and to create customer segments. In the arts and entertainment industry, it is also useful in marketing research: any organization benefits from insight into who its customers are, why they choose it, and what would make them more satisfied and loyal.

Data in art organizations

Arts organizations already benefit from data-driven methods of engaging audiences. According to The State of Data in the Nonprofit Sector report, around 90% of nonprofits collect data, but half of them are not sure how to use it. In the Nonprofit Trend report, 85% of respondents say they “use insights from marketing and engagement data to target outreach efforts.” Organizations are used to applying data tools to collect customer data and build stronger relationships. It is easy to find information about ticket buyers, such as their email, address, the kinds of performances they like, or how much they spend on your organization’s events. That information helps organizations create customer segments, learn preferences, and reach out to audiences directly. According to one study of nonprofits, the most common use of audience data is sending newsletters to the contact details on file, while fewer than 40% of respondents use audience data to personalize campaigns or inform the process of creating artworks.

The major finding of nonprofits collecting data, Source: The State of Data in the Nonprofit Sector
How nonprofits use audience data, Source: Arts&metrics

Machine learning in audience analysis

Arts organizations should analyze their data more deeply. With new technologies, organizations can collect data that is larger and comes in multiple forms, and can find better insights through audience analysis. Machine learning makes it possible to find patterns in disorganized audience behavior data. We will focus on three applications this technology can bring to arts organizations: building better relationships, making recommendations, and improving services and productions.

Profile your customers for a better relationship

Customer Relationship Management (CRM) software manages all of a company’s customer relationships in one place. Arts organizations benefit from it by bringing together disparate activities such as ticketing, fundraising, and data reporting, which makes it easier to target the audience. For example, it allows organizations to turn transaction records into relationships: you can easily find first-time buyers and email them about upcoming events to improve their return rates.

Machine learning offers more possibilities for CRM. Built on the CRM platform, it can move the analysis from “what” happened to “why” it happened. Besides using sales data to predict future revenue, it is also useful for profiling your audience, and accurate customer segments lead to better connections. Integrating big data and machine learning methods into CRM brings more types of data onto the platform and plays an important role in understanding audience behavior. By mining historical ticket sales data in depth, it helps find more patterns in buyers’ habits and predict the probability that they will return.
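As a rough illustration of predicting the probability of return from ticketing history, the sketch below fits a one-feature logistic regression with plain gradient descent. Everything in it is an assumption for the sake of example: the `HISTORY` records are hypothetical, the only feature is the number of tickets a buyer purchased last season, and the function names are invented. A real CRM export would carry many more features.

```python
from math import exp

# Hypothetical history: (tickets bought last season, returned this season? 1 or 0)
HISTORY = [(1, 0), (1, 0), (1, 0), (1, 1), (2, 0),
           (2, 1), (3, 1), (3, 0), (4, 1), (5, 1)]

def sigmoid(z):
    """Squash any number into a probability between 0 and 1."""
    return 1.0 / (1.0 + exp(-z))

def fit_logistic(rows, learning_rate=0.1, steps=2000):
    """Fit a weight and bias by gradient descent on the average log loss."""
    weight, bias = 0.0, 0.0
    n = len(rows)
    for _ in range(steps):
        grad_w = grad_b = 0.0
        for tickets, returned in rows:
            error = sigmoid(weight * tickets + bias) - returned
            grad_w += error * tickets
            grad_b += error
        weight -= learning_rate * grad_w / n
        bias -= learning_rate * grad_b / n
    return weight, bias

def return_probability(model, tickets):
    """Estimated probability that a buyer with this many tickets returns."""
    weight, bias = model
    return sigmoid(weight * tickets + bias)

model = fit_logistic(HISTORY)
for tickets in (1, 3, 5):
    print(tickets, round(return_probability(model, tickets), 2))
```

In this toy data, buyers who bought more tickets returned more often, so the estimated probability rises with the ticket count. Scores like these could feed the segmentation described next.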

Organizations can use the results for more accurate segmentation and send different newsletters to audiences with different return rates or buying habits, which helps them make deeper connections and improve loyalty. For unstructured data and free-text fields, it is difficult to extract common themes with traditional statistical methods, but an algorithm can be set up to pick out inquiries, complaints, and references to specific shows. Machine learning is also useful in fundraising. Donors give because they want organizations to meet their needs. With an accurate profile, we can learn more about donors’ interests and explore how to introduce programs to donors that help them pursue those needs. This helps cultivate long-term relationships and encourage giving.

The arts industry has already applied machine learning to customer segmentation. Purple Seven, a theater and arts data analytics company, uses big data to help organizations find insights. By combining an organization’s CRM with external big data on performing arts consumption, Purple Seven can identify which bookers are most likely to return, what the audience is interested in, how to reach them, and where to find new audiences.

Purple Seven, Source: Purple Seven Website

Make personalized recommendations

Applying machine learning to CRM makes customization possible. A personalized recommendation system provides recommendations based on historical behavior data. It is a common application of big data to user behavior analysis, particularly on online streaming platforms like YouTube and Netflix. Netflix uses the watch history of other users with similar tastes to recommend what you may be most interested in watching next. It excels at analyzing audience profiles and pushing personalized products. To maximize the satisfaction of viewers with different preferences, it even cuts several versions of a trailer and distributes them to different people based on user profiles built from big data. The benefit is significant: over 75% of Netflix viewer activity is based on personalized recommendations, and the recommendation system, which accounts for over 80% of the content streamed on the platform, is credited with earning the company over $1 billion.
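The idea of recommending titles from the histories of users with similar tastes can be sketched in a few lines. This is a simplified, user-based collaborative filter and not a description of Netflix’s actual system; the `HISTORY` data, the user names, and the function names are all hypothetical.

```python
from math import sqrt

# Hypothetical histories: the set of titles each user has watched or attended.
HISTORY = {
    "ana":   {"Hamlet", "Swan Lake", "Carmen"},
    "ben":   {"Hamlet", "Swan Lake", "Giselle"},
    "chloe": {"Stand-up Night", "Improv Jam"},
    "dev":   {"Hamlet", "Carmen", "Tosca"},
}

def cosine_similarity(a, b):
    """Cosine similarity between two sets treated as 0/1 vectors."""
    if not a or not b:
        return 0.0
    return len(a & b) / sqrt(len(a) * len(b))

def recommend(user, history, top_n=2):
    """Rank unseen titles by the similarity of the users who watched them."""
    seen = history[user]
    scores = {}
    for other, titles in history.items():
        if other == user:
            continue
        weight = cosine_similarity(seen, titles)
        for title in titles - seen:
            scores[title] = scores.get(title, 0.0) + weight
    # Highest score first; ties broken alphabetically for a stable order.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [title for title, score in ranked[:top_n] if score > 0]

print(recommend("ana", HISTORY))
```

Here "ana" shares two titles each with "ben" and "dev", so their other titles are suggested, while "chloe" has nothing in common with her and contributes nothing.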

Different types of trailers on Netflix, Source:

Unlike the entertainment industry, traditional arts organizations focus on offline events. Currently, personalized recommendation systems are used mainly by galleries: software such as Arternal, Artbase, and Artcloud uses big data and machine learning to predict artwork prices for customers and make personalized recommendations. Although few recommendation systems are widely used in the theater industry, data scientists are building models of customers’ artistic preferences and of similarities between performances in order to make personalized recommendations.

Arternal Website, Source: Arternal

Improve content and service

Organizations want to know not only why audiences choose them, but also what their whole experience is like and which parts or details capture their interest or leave them satisfied. Analyzing audience behavior at events can help achieve this goal. On online platforms, algorithms analyze users’ clicks, viewing time, and comments to find out which types of plots are popular. Investment in productions is also guided by predictions about which main characters and plots will succeed. Millions of visits per day give a platform a large number of data samples to support this analysis and prediction, and so to provide audiences with more of the works they want.

Museums use a similar method to explore how visitors interact with the venue. The British Museum uses machine learning to analyze data on the visitor experience. It can find out how people experience its exhibitions: what routes they take, what they engage with, how many minutes they spend at each installation, and which pieces they choose to ignore. This shows the museum what interests its audience, what is working, and what is not, so that future promotions, events, and exhibition design can be more targeted. For example, if many visitors linger at paintings on a similar topic, it may be worth considering an educational tour of those artworks. This helps to increase visitation, harness social outcomes, and deliver efficiencies.

The British Museum, Source: Microsoft

Theaters can hardly collect audience reviews during an event, but show lovers love to post their feelings on social media. By analyzing these unstructured comments on the internet, theaters can get a view of what the audience likes and dislikes about a show.
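As a very rough sketch of what analyzing unstructured comments can look like, the example below counts positive and negative keywords in each comment. Note that this is a simple keyword lexicon, used here as a stand-in for a trained sentiment model; the word lists, the comments, and the function names are all hypothetical.

```python
import re
from collections import Counter

# Hypothetical word lists; a real project would use a trained model
# or a curated sentiment lexicon instead.
POSITIVE = {"loved", "brilliant", "moving", "stunning", "great"}
NEGATIVE = {"boring", "slow", "disappointing", "overpriced", "cramped"}

def score_comment(text):
    """Positive keyword count minus negative keyword count."""
    words = re.findall(r"[a-z']+", text.lower())
    return sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)

def summarize(comments):
    """Count how many comments lean positive, negative, or neutral."""
    labels = Counter()
    for comment in comments:
        score = score_comment(comment)
        if score > 0:
            labels["positive"] += 1
        elif score < 0:
            labels["negative"] += 1
        else:
            labels["neutral"] += 1
    return labels

COMMENTS = [
    "Loved the second act, the lead was brilliant!",
    "Stunning set, but the seats were cramped and the show felt slow.",
    "Went with friends on Friday.",
]
print(summarize(COMMENTS))
```

Keyword counting misses sarcasm and context, which is one reason to prefer models trained on labeled comments once enough data is available.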


Across these applications, the common thread is analyzing audience behavior, discovering the characteristics of the audience, and using that knowledge to engage more people, build loyalty, and expand the impact of the organization. That is why we use data analytics tools: we want a measurable way to know our audience, to discover who they are, why they are here, and what they want, and then to broaden that audience and connect with it in innovative ways. Arts organizations have always valued audience feedback, but previously we could gather it only through common sense, observation, and interviews with individuals. Now machine learning gives us a chance to get more comprehensive information and discover hidden insights in large amounts of data. With machine learning, it is easier to build a clear profile of your audience and their story with your organization, and to improve the organization and build deeper relationships in a targeted way. A better understanding of your audience is also a better understanding of your organization. Small arts organizations may face barriers, such as difficulty gathering a large data sample. Patterns become more accurate with more data, but even a small dataset is still useful for learning more about your audience.

Using big data is a worldwide trend, and arts organizations should not miss the chance to take part in it as an efficient way to connect with audiences. They will gradually discover the importance of data and get used to using analytical tools to find insights and understand their audiences.


Abernethy, Jacob D., Cyrus Anderson, Alex Chojnacki, Chengyu Dai, John Dryden, Eric M. Schwartz, Wenbo Shen, Jonathan C. Stroud, Laura Burdick, Sheng Yang, and Daniel T. Zhang. “Data Science in Service of Performing Arts: Applying Machine Learning to Predicting Audience Preferences.” arXiv abs/1611.05788 (2016).

“Art Gallery Software Market 2021 High Growth Forecast Due to Rising Demand and Future Trends.” ZNews Africa, February 25, 2022.

Carlsson, Rebecca. “Big Data and Museums.” MuseumNext, March 30, 2021.

Day, Adrienne. “Data-Driven Connections for a Better World (SSIR).” Using Data and Technology to Create World-Changing Connections Between Nonprofits and Their Supporters, 2020.

“How CRM Can Help You Outperform National Arts Industry Revenue Benchmarks.” NAMP, May 15, 2019.

“How Netflix Used Big Data and Analytics to Generate Billions.” Selerity, September 27, 2021.

Midura, Danine. “Leveraging Machine Learning Using CRM Data.” Technology Advisors, July 14, 2021.

“New Audience.” Ebook. Purple Seven, 2022. Accessed March 2, 2022.

Sauravdeb. “Introduction to Machine Learning: Iris Dataset.” Medium, February 18, 2022.

“The British Museum Is Using Big Data to Help Visitors Learn More about History.” Microsoft News Centre UK, July 4, 2017.

“The National Gallery Predicts the Future with Artificial Intelligence.” Digital meets Culture, September 14, 2017.

“The State of Data in the Nonprofit Sector.” everyaction. Accessed March 2, 2022.

“Uses of Machine Learning: List of Top 10 Uses of Machine Learning.” EDUCBA, March 2, 2021.

Villaespesa, Elena. “Digital Culture 2014 – How Arts Organisations Use Audience Data.” arts&metrics, December 8, 2014.

“What Is the Data, Information, Knowledge, Wisdom (DIKW) Pyramid?” Ontotext, October 22, 2020.

Yu, Allen. “How Netflix Uses AI and Machine Learning.” Medium. Becoming Human: Artificial Intelligence Magazine, October 1, 2019.
