1. Introduction

These guidelines describe how to rate Search Ads shown in the iOS App Store. For this task, you will rate the relevance of these ads to user queries. The information produced will be used for evaluation purposes only.

Ads for Apps and Games

When a user searches for an app or game in the App Store, they’ll see a list of results. At the top of that list, there will often be an advertisement for a different app or game. These ads are highlighted in blue and have the word “ad” embedded in them. Under the ad will be a list of results for the search they did.

Why Show Ads?

Ads help users discover apps and games related to their interests that they might not have known about or searched for.

What Makes an Ad Relevant?

An ad’s relevance is determined by its relationship to the user’s intent, as expressed in their search query.

Ads can be relevant to user intent in several different ways. They can be exactly what the user was looking for, but they can also be something that the intent implied by the user’s query suggests they might be interested in.

Remember: An ad for an app or game does not have to be for, or be about, exactly what the user asked for to be relevant.

2. Rating Interface

On the left-hand side of the rating tool interface, you will see the user search query. Under the query are links you can use to help begin your research on the query and what it means.

Under these links is another link to the App Store search results for the query. This will show you the results a search for the query at the App Store would give.

Remember to look at the search results only and not the ads (if any) that appear at the top of the result page.

In the center of the tool is the result ad to be rated. The ad shows you the name and icon of the advertised app, plus:

  • Subtitle (a brief summary of what the app does)
  • Review score and number of reviews
  • Developer (the company or person that created the app)

For more information about the advertised app, including a detailed description and user reviews, click the icon or the name for a link to the App Store Preview page.

Based on this information, you will make two ratings:

  • Query Type
  • Ad Relevance

There is also a comment box where you can provide brief notes on how you chose your ratings. Comments should include:

  • The reasoning behind your ratings
  • References or links to any research you did to understand the query or the result

3. Query Intent

It’s essential to first understand what the user was looking for when they entered their search term. This section will explain how to interpret user queries, and how to answer the Query Type question.

Research

Many queries will require some web research to understand the user’s intent, especially if they refer to an app or kind of app that you’re not familiar with. Usually the best way to research a query is to type it into a search engine and add words like “app”, “game”, “iphone” etc. if the original query isn’t specific enough. Use the search engine links in the tool to start your search. If you have access to an iPhone or iPad, you can also try searching the App Store directly.

Along with App Store Preview pages for a specific app, web searches will often return news articles and other information about the queried app or type of app. You can use this information to get an idea of the prominence of the apps involved and the audience they serve, which can be taken into account when rating the result ad.

Research can also be helpful when a query is ambiguous, looks like nonsense, or does not appear to be about an app. The query “monkey,” for instance, actually refers to a video chat app.

Where a query could describe a generic class of apps but also matches the name of a specific app or game, weigh the prominence of the specific app against how common the app category is. The number of user reviews gives a good idea of how popular the app is.

For example, the query “school driving” could be an oddly phrased search for driving instruction apps, but a web search for the query returns a highly ranked result for the driving simulation game School Driving 3D, which has over a thousand reviews, and is a more likely intent.

You can also use the link to the App Store search results for the query. This will show you the results a search for the query at the App Store would give.

Use these results as an additional research tool to help you understand what the intent of the query could be. They may reveal intents you had previously dismissed or had not considered.

Multiple Interpretations

Queries can be ambiguous, with multiple likely interpretations.

The query “coco” is likely to refer to the social networking app Coco, which has over 700 reviews and ranks very highly in search engine queries for “coco app”. On the other hand, it might also refer to the children’s games from developer Coco Play, with names like Coco Wedding and Coco Ice Princess, many of which have thousands of reviews.

A particularly confusing type of query just names a real-world activity, interest or profession:

  • “doctor”
  • “gymnastics”
  • “makeover”
  • “makeup”

At first glance, these might look like app queries – searching for virtual makeup apps like Perfect365, or floor routine trackers for gymnastics. However, there’s an equally high probability that these queries were actually written by children looking for doctor-themed games like Doctor X: Med School, or gymnastics games like Gymnastics Superstar.

When rating result ads for these queries, consider a result for either interpretation equally likely.

Spelling

Most queries will be fully typed, but some may be unfinished, abbreviated, or contain spelling errors. In these cases, rate as if the query was typed correctly, as long as its original intent is reasonably clear. For example, the query “tumb” is highly likely to refer to Tumblr, while “fb” is likely to mean Facebook.

You will also see some very long queries that seem unlikely to have been typed by a real user, for example the query “ricky carmichael’s motocross matchup pro.” This is usually due to the App Store having offered the user an autocompleting query suggestion for this query, and shouldn’t influence your rating.

Foreign Languages

Sometimes, queries will use a language or script that isn’t commonly used in your market. In these cases, use online translation services to understand the query as well as you can.

Short Forms, Abbreviations, and Slang

Not every user will type in the full or proper name of the app or game they’re looking for. In Thailand, “ไอจี” is a short form for Instagram, while Canadians looking for the Tim Hortons coffee shop app may enter the query “timmie’s.” The game Clash of Clans may be referred to as “CoC” and “ARTS games” are action real-time strategy games. Always research any query you’re not sure of and consider the slang and short forms used in your locale.

Japan storefront – Unexpected Spaces

In the Japan (JP) storefront, you may see unexpected spaces in a query, such as a word broken into individual characters. Ignore these spaces and rate the relevance of the ad as if they were not there. They do not affect search ad results.

Examples:

  • Query: ばうんてぃ (Bounty) = ばうんてぃ (Bounty)
  • Query: 漫 画 (Man ga) = 漫画 (Manga)
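
For anyone who prepares or compares queries in a script before rating, the rule above can be sketched in Python. This is only an illustration and not part of the rating tool; the function name `strip_query_spaces` is hypothetical. It removes every whitespace character, including the full-width space (U+3000) that is common in Japanese text, so it should only be applied where the spaces are known to be unintended.

```python
def strip_query_spaces(query: str) -> str:
    """Return the query with all whitespace removed.

    Hypothetical helper. str.split() with no arguments splits on any
    Unicode whitespace, which includes the full-width space U+3000.
    """
    return "".join(query.split())


print(strip_query_spaces("漫     画"))  # 漫画
```

The rating itself still depends on your judgment of the intent; the helper only shows what "treat them as if there are no spaces" means for the query string.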

Before rating the ad, you must select the Query Type. This requires you to understand whether the user had a particular app in mind or was searching for any app that fulfills their needs.

Pick one of the following:

  1. Navigational – the user is looking for a specific app, series of apps (for example the Candy Crush games), or developer.
  2. Functional – the user is looking for a category of app or an app that performs a particular function.
  3. Mixed – the query could be interpreted as both navigational and functional.
  4. Unclear – you can’t tell what the user is looking for.

Here are some examples of where you should select Navigational for Query Type:

Query            Type          Explanation
meet me          Navigational  User is looking for the social networking app MeetMe.
candy crush      Navigational  User is looking for Candy Crush Saga, or other games in that series.
pixle gun        Navigational  Though the word is misspelled, user intent is for Pixel Gun 3D.
yt               Navigational  This is a common abbreviation for YouTube.
bank of america  Navigational  Looking for the Bank of America Mobile Banking app, or other apps from Bank of America.

Note: Queries that could be looking for one of multiple unrelated apps that share the same name should still be considered Navigational.

For example, the query “v4” could refer to a fighting game or a bicycle navigation app, both named “V4”. That query should be rated Navigational, even though it’s not clear which of the two apps the user was looking for.

Here are examples of cases where you should use Functional. Note that this covers a wide range of queries, from naming a very simple, broad category to highly specific requests:

Query                     Type        Explanation
strategy games            Functional  User is looking for any game in the very broad “strategy” genre.
coloring book for adults  Functional  Looking for any coloring book apps, not a specific one.
instagram followers       Functional  User is searching for apps to find who follows them on Instagram.
learn english             Functional  User looking for an app to help them learn English. They have not named a specific app.

Use Mixed when the query could be for either a specific app whose name matches the string, or any app in the category that the string describes.

Query          Type   Explanation
transit        Mixed  This could be a search for any kind of public transit app, but it could also be referring to the popular app named Transit, which shows up at the top of web or app store searches for “transit app”.
wifi analyser  Mixed  This could be referring to any apps that analyse WiFi signal, but web research shows it’s also the name of a popular app for other platforms.
formula 1      Mixed  The user might be looking for the official F1 app, any third-party apps about F1, or any games themed around Formula 1.

Some queries will refer to a specific real-world entity that might have apps associated with it, but one that doesn’t correspond directly to one particular app. This might be:

Since those queries are in a sense both navigational – referring to a particular entity – and functional – searching for any app associated with them – choose Mixed for Query Type.

Use Unclear when the query is impossible to understand. This can include nonsense or incomplete queries where there’s not enough information to get any idea of what the user is looking for.

Note: Queries that can have more than one meaning, like “opera,” which could refer to the browser or the performance art, are not Unclear. They may have more than one possible intent, but can easily be understood.

Query                     Type     Explanation
fix                       Unclear  This query appears incomplete, and it’s not clear what the user wants.
basketball people jersey  Unclear  User might be interested in something related to basketball jerseys, but it’s not clear what.

When you select Unclear, you will not need to rate the Relevance question.

Apps not in the iOS App Store

Some navigational queries will refer to apps or games that are not in the iOS App Store. For example, the following queries would not return the result the user was interested in:

Query                    Intended Result    Notes
sketch                   Sketch App         A popular design app that doesn’t currently have an iOS version.
youtvplayer              You TV Player      A video streaming app only available for jailbroken iPhones.
league of legends        League of Legends  A PC/Mac game. Similar games like Mobile Legends are available on iOS, but not the original.
minecraft bed wars free  BedWars | Hypixel  This is a community-developed mod for the game Minecraft, so no specific results will be returned that match it.

In these cases, answer Navigational for Query Type, as the user’s intent was still to find that specific app. Note in your comment that the app wasn’t available in the iOS App Store.

4. Relevance

After answering the query level question, you will use your understanding of the query to rate the relevance of the returned ad. This section will walk you through how to answer this question, focusing first on queries for apps and then on ones for games.

4.1. Relevance Question

The Relevance question has four available rating options:

  1. Excellent – the app has a strong relationship to the query intent and is among the most likely to be of interest to the user.
  2. Good – the app has some relation to the query intent, and is quite likely to be of interest to the user, but other apps might be more compelling.
  3. Acceptable – the app has a slight relation to the query intent, and the user would not be surprised to see it as an ad, but they aren’t that likely to be interested.
  4. Bad – the app doesn’t have any relation to the user intent, and/or seems likely to surprise the user and make their experience worse. Note: If you choose this rating, you must leave a comment explaining why.

Note: Don’t demote apps for having few reviews, low review scores, high prices, or poorly written descriptions.

Apps or games should also not be demoted because they cost money, even if the query is for a free app or includes the word “free.”
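
The four rating options above, together with the earlier rule that Unclear queries skip the Relevance question, amount to a small completeness check on each rating. The Python sketch below illustrates that check under stated assumptions: the names `QueryType`, `Relevance`, `Rating`, and `check_rating` are hypothetical and do not describe how the rating tool works internally.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional


class QueryType(Enum):
    NAVIGATIONAL = "Navigational"
    FUNCTIONAL = "Functional"
    MIXED = "Mixed"
    UNCLEAR = "Unclear"


class Relevance(Enum):
    EXCELLENT = "Excellent"
    GOOD = "Good"
    ACCEPTABLE = "Acceptable"
    BAD = "Bad"


@dataclass
class Rating:
    """One rater submission (hypothetical structure)."""
    query_type: QueryType
    relevance: Optional[Relevance] = None
    comment: str = ""


def check_rating(rating: Rating) -> List[str]:
    """Return a list of problems; an empty list means the rating is complete."""
    problems: List[str] = []
    if rating.query_type is QueryType.UNCLEAR:
        # Unclear queries do not need a Relevance rating.
        return problems
    if rating.relevance is None:
        problems.append("Relevance is required unless Query Type is Unclear.")
    elif rating.relevance is Relevance.BAD and not rating.comment.strip():
        problems.append("A Bad rating must include a comment explaining why.")
    return problems


print(check_rating(Rating(QueryType.FUNCTIONAL, Relevance.BAD)))
# ['A Bad rating must include a comment explaining why.']
```

The check covers only the mechanical rules stated in these guidelines. Choosing between Excellent, Good, Acceptable, and Bad remains a judgment about user intent.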

4.2. App Queries

When the query intent is for an app, rate the relevance of an ad by how it relates to the user intent conveyed in the query, considering the functionality of the app and the audience it seems to target.

When is an app ad relevant?

A query often suggests a range of multiple possible interests or needs that could be satisfied by an app. For every query, think about where the result ad falls in this range.

For example:

  • A query for a gym app shows the user has a definite interest in apps for working out in a gym, and less strongly implies a broader interest in apps for other fitness activities, such as running
  • A query for a dog training app shows the user has a clear interest in apps for dog obedience, a strongly implied potential interest in dog-related apps overall, and a weaker potential interest in pet-related apps that have some functionality of use to dog owners
  • A query for a calculator app shows a strong interest in or need to do math, which could be for professional activities, school homework, or something else. In this case, it’s not possible to infer very much else about this user beyond that interest in calculating something.

App ads that should be rated Excellent have functionality matching the most obvious interests or needs, as those are ads the user is highly likely to interact with. Those relating to less strongly implied intents should be Good, as they are still useful as recommendations, helping to expose the user to new apps that they might be interested in.

When is an app ad not relevant?

If an app ad is on the relevance borderline, where a user is likely to recognize that the functionality it provides could be relevant to their interests or needs, but the link is weak, choose Acceptable.

Reserve the Bad rating option for apps that seem like they would negatively catch the attention of a user expecting the ad to be relevant. This can include the following scenarios:

Issue: Ad app has no perceivable link to the user’s interests or needs
  Example: Query: a banking app; Ad: a birdwatching app
  Notes: Banking and birdwatching have nothing to do with one another. Rate Bad.

Issue: Ad app is technically linked to user intent, but it seems illogical to advertise based on that link
  Example: Query: a grocery shopping app; Ad: a clothes shopping app
  Notes: Both involve shopping, but for completely different things and with very different audiences. Rate Bad.
  Example: Query: a document to PDF scanner; Ad: a virus scanner app
  Notes: Both involve some kind of scanning, but not in a way that makes one of interest to users of the other. Rate Bad.

Issue: Intent is in the same broad category as advertised app, but they fit into distinctly different niches that don’t relate to each other
  Example: Query: a dog training app; Ad: a cat training app
  Notes: Both are related to pets, but there’s no evidence to assume this user has cats as well as dogs. Rate Bad.
  Example: Query: a dictionary app; Ad: an app designed to teach young children words
  Notes: Though the user shows a need to look up words, there is no basis to assume they want to do so for a child’s education. Rate Bad.

App Similarity

When the query intent is navigational and the result ad seems to serve the same purpose as the intent, consider the similarity of the queried app and the result app when deciding how likely the app is to be of interest to the user.

Ads for apps that seem to be competing for the same users by offering a similar feature set should be considered highly likely to interest the user, and be rated Excellent.

Examples of pairs of competing apps include:

  • Uber and Lyft
  • WhatsApp and Facebook Messenger

(Users may use both Uber and Lyft in order to catch the nearest car, or both WhatsApp and Facebook Messenger to talk to different friends, but not because the apps do anything fundamentally different.)

Apps that have functional overlap but some considerable difference between them are potentially less likely to interest the user, for example:

Depending on how similar you find those feature sets, you might rate pairs of apps such as these either Good or Acceptable.

Local Knowledge

Always consider a query’s meaning in the context of the locale where it is being rated.

For instance, in most English-speaking countries, the user intent for the query “bbc” is most likely apps from the British Broadcasting Corporation, implying an interest in news, entertainment, etc. For those markets, apps for learning a language would not be relevant. But in some Chinese-speaking markets, watching and listening to the BBC is a popular way to learn English. Many users in these markets would see a clear relationship between a query for the BBC and an ad for a language learning app.

Rating Examples

The following sections will take you through examples of how to apply these concepts.

Excellent App Examples

These ads are for apps that seem to directly match the user intent, and are rated Excellent:

Direct Competitor

When the query is navigational and the result is a direct competitor of the app the user wanted to find, rate Excellent:

Same Developer

Note that if the query is the name of a developer and the ad returned is from that developer, the rating should be Excellent.

The developer’s name can be found in the ad or on the App Store page.

Apps from the same developer as the queried app that don’t otherwise match the themes or functionality of that app should be rated no lower than Acceptable, as they can be automatically considered to be at least slightly relevant to that query:

Good App Examples

Downgrade to Good for results that are still close enough to the query intent to be logical and potentially of interest to the user, but that seem less strongly linked to the query.

The weakness might be in the app’s function, themes, and/or perceived target audience.

Accessory Apps

Ads for accessory apps, which provide some kind of add-on or extra functionality to the app the user was searching for, should also be rated Good:

Note: When rating app ads that require a particular device, assume that the user does own one of these devices.

Acceptable App Examples

Rate Acceptable for apps that have a slight relation to the query intent, and that users would not be surprised or offended to see as ads, but that they aren’t that likely to be interested in.

Note: If an ad is for an app or game that’s relevant to the query but not available in the test locale, rate it Acceptable.

Bad App Examples

Rate Bad when you can’t see any relationship between the query and the ad, or when the functionality of the intent and ad are so mismatched as to be potentially embarrassing.

Offensive Results

Rate Bad for any cases where the ad returned is for something that doesn’t match the query and could be offensive to the user, given that they didn’t ask to see it:

4.3. Game Queries

When the user intent is for games, rather than apps, your considerations should be somewhat different. A user looking for games is not looking for something to serve a particular purpose, but for something to entertain them.

The relevance of a game ad should be based on factors like:

  • Play style – what sort of experience does this game provide? What are the skills it requires, or activities it requires the player to do? For example, a game might test a player’s reflexes, or their problem-solving capabilities, or their ability to plan and act strategically
  • Presentation – what does the game look like? What sort of themes does it center around? For example, a game might be colorful and cartoony, or dark and horror-themed, or gritty and realistic
  • Audience – who is this game for? Does it seem to target particular users? For example, a game could be targeted at children, or sports fans, or animal lovers

The more appealing a user might find the game offered in the ad based on their query, the better its rating should be.

Reserve the Bad rating for games that stand out as being completely and/or jarringly different from the user intent. For example:

  • A puzzle game that requires slow careful thought, versus an action game that rewards quick reactions
  • A game themed around World War II featuring realistic weapons, versus a cartoonish farm building game
  • A game themed around colorful cartoon characters intended for small children, versus a shooting game intended for more mature users

Game Query Rating Examples

The following sections will take you through examples of how to apply these concepts.

Excellent Game Examples

Rate Excellent when the game offers all, or nearly all, of what the functional query was looking for or when the named game and the advertised game share enough of the same play style and themes that they would appeal to the same users.

Good Game Examples

Downgrade to Good for games that are close enough to the intent or play style and presentation suggested by the query to be logical and potentially of interest to the user, but that seem less strongly linked to the query.

The weakness might be in the game’s play style, themes, and/or perceived target audience.

Ads for accessory apps, which provide some kind of add-on or extra functionality to the game the user was searching for, should also be rated Good:

Acceptable Game Examples

Rate Acceptable for games that have a slight relation to the query intent or named game, and that users would not be surprised or offended to see as ads, but that they aren’t that likely to be interested in.

If the query specifies that the app should be “new” or “popular”, those that are older or less popular will not be as relevant to the user. Web research, along with the app’s Version History link, can provide more information about an app’s age.

Bad Game Examples

Rate Bad for any cases where the game returned is for something that doesn’t match the query in any way, or where the presentation of, or audience for, the intent and the ad seem so mismatched as to be potentially embarrassing.

5. Rating Overviews

The examples in this section will demonstrate a variety of ratings for the same query.

5.1. Navigational Queries

Users with Navigational queries are looking for a specific app, series of apps (for example the Candy Crush games), or developer. Apps that exactly match the query intent or are its direct competitors will be the most satisfying, but apps that relate to the intention expressed in the query can also have some degree of relevance.

5.2. Functional Queries

Users with functional queries are looking for a category of app or an app that performs a particular function. The rating you give the app will depend on how well the app matches the category or performs the function.

Remember: An app does not have to be in the exact category or perform the exact function requested to be of interest to a user and have relevance.

5.3. Mixed Queries

A mixed query could be interpreted as both navigational and functional. The rating you give will depend on how closely the app matches one or both possible intents and how interesting it would be to a user with the query.
