Scraping Mac App Store reviews

You have some products on the Mac App Store and would like to know what people think of them, but you can’t be bothered to check for reviews one store at a time?

Don’t worry, you’re not alone. I can’t understand why Apple made it so difficult to check reviews for your own products in iTunes Connect, so here’s a little PHP script that will do the dirty work for you: it’ll check all the stores and fetch all the reviews, producing an XML file per product with the most important info (review, reviewer, version of the product…).

You just need to rename the config file to macappstore.cfg.php, fill in the $products array, create an output subdirectory and execute:

php macappstore.php

After a while you’ll see the XML appearing in the output directory, ready for further processing.
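
For reference, here is a minimal sketch of what macappstore.cfg.php might contain. It assumes, going by the replies in the comments, that $products maps the numeric product ID to the product name and that each product ends up in an "<id>_<name>.xml" file in the output directory. The ID, the name and the outputFileName() helper are placeholders for illustration, not part of the script.

```php
<?php
// macappstore.cfg.php (sketch): numeric product ID => product name.
// The ID and name below are placeholders, not a real product.
$products = array(
    '123456789' => 'My Example App',
);

// Hypothetical helper: the XML file name to look for after a run,
// following the "<id>_<name>.xml" pattern mentioned in the comments.
function outputFileName($id, $name)
{
    return 'output/' . $id . '_' . $name . '.xml';
}

foreach ($products as $id => $name) {
    echo outputFileName($id, $name), "\n";
}
```

The key is only the number: no "id" prefix.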

24 comments

  1. Nice script… it worked! Very slow because of all the retrieval, but that’s OK… I just let it cook until it’s done.

    I added some code at the end to display the results as HTML. Like this:

    foreach( $items as $id => $item ) {
        $doc->startElement( 'review' );
        $doc->writeAttribute( 'id', $id );

        foreach( $item as $key => $val )
            $doc->writeElement( $key, $val );

        // Added: also print each review as HTML
        echo "<p><b>{$products[$prId]} &mdash; {$item['date']} &mdash; Version {$item['version']} &mdash; $store</b>";
        echo "<br><i>{$item['rating']} &mdash; {$item['title']}</i> ({$item['reviewer']})";
        echo "<blockquote>{$item['body']}</blockquote>";

        $doc->endElement();
    }
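
The $doc calls in the snippet above match PHP's XMLWriter API. Below is a self-contained sketch of writing reviews that way. The root element name (reviews), the sample data and the reviewsToXml() function are assumptions made for the example; the real script may structure its output differently.

```php
<?php
// Sample data standing in for what the script scrapes (made up).
$items = array(
    '42' => array(
        'title'    => 'Great app',
        'reviewer' => 'someone',
        'rating'   => '5',
        'version'  => '1.2',
        'date'     => '01-Feb-2012',
        'body'     => 'Works well & looks good.',
    ),
);

// Hypothetical function: serialize the reviews to an XML string.
function reviewsToXml(array $items)
{
    $doc = new XMLWriter();
    $doc->openMemory();                 // build the XML in memory
    $doc->setIndent(true);
    $doc->startDocument('1.0', 'UTF-8');
    $doc->startElement('reviews');      // root element name is an assumption

    foreach ($items as $id => $item) {
        $doc->startElement('review');
        $doc->writeAttribute('id', (string) $id);
        foreach ($item as $key => $val) {
            $doc->writeElement($key, $val);   // XMLWriter escapes & and <
        }
        $doc->endElement();             // closes <review>
    }

    $doc->endElement();                 // closes <reviews>
    $doc->endDocument();
    return $doc->outputMemory();
}

echo reviewsToXml($items);
```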
  2. Bob says:

    Here it is:

        $appStores = array(
    'Algeria' => 143563,
    'Angola' => 143564,
    'Anguilla' => 143538,
    'Antigua & Barbuda' => 143540,
    'Argentina' => 143505,
    'Armenia' => 143524,
    'Australia' => 143460,
    'Austria' => 143445,
    'Azerbaijan' => 143568,
    'Bahrain' => 143559,
    'Bangladesh' => 143490,
    'Barbados' => 143541,
    'Belarus' => 143565,
    'Belgium' => 143446,
    'Belize' => 143555,
    'Bermuda' => 143542,
    'Bolivia' => 143556,
    'Botswana' => 143525,
    'Brazil' => 143503,
    'British Virgin Islands' => 143543,
    'Brunei' => 143560,
    'Bulgaria' => 143526,
    'Canada' => 143455,
    'Cayman Islands' => 143544,
    'Chile' => 143483,
    'China' => 143465,
    'Colombia' => 143501,
    'Costa Rica' => 143495,
    'Cote D\'Ivoire' => 143527,
    'Croatia' => 143494,
    'Cyprus' => 143557,
    'Czech Republic' => 143489,
    'Denmark' => 143458,
    'Dominica' => 143545,
    'Dominican Rep.' => 143508,
    'Ecuador' => 143509,
    'Egypt' => 143516,
    'El Salvador' => 143506,
    'Estonia' => 143518,
    'Finland' => 143447,
    'France' => 143442,
    'Germany' => 143443,
    'Ghana' => 143573,
    'Greece' => 143448,
    'Grenada' => 143546,
    'Guatemala' => 143504,
    'Guyana' => 143553,
    'Honduras' => 143510,
    'Hong Kong' => 143463,
    'Hungary' => 143482,
    'Iceland' => 143558,
    'India' => 143467,
    'Indonesia' => 143476,
    'Ireland' => 143449,
    'Israel' => 143491,
    'Italy' => 143450,
    'Jamaica' => 143511,
    'Japan' => 143462,
    'Jordan' => 143528,
    'Kazakstan' => 143517,
    'Kenya' => 143529,
    'Korea, Republic Of' => 143466,
    'Kuwait' => 143493,
    'Latvia' => 143519,
    'Lebanon' => 143497,
    'Liechtenstein' => 143522,
    'Lithuania' => 143520,
    'Luxembourg' => 143451,
    'Macau' => 143515,
    'Macedonia' => 143530,
    'Madagascar' => 143531,
    'Malaysia' => 143473,
    'Maldives' => 143488,
    'Mali' => 143532,
    'Malta' => 143521,
    'Mauritius' => 143533,
    'Mexico' => 143468,
    'Moldova, Republic Of' => 143523,
    'Montserrat' => 143547,
    'Nepal' => 143484,
    'Netherlands' => 143452,
    'New Zealand' => 143461,
    'Nicaragua' => 143512,
    'Niger' => 143534,
    'Nigeria' => 143561,
    'Norway' => 143457,
    'Oman' => 143562,
    'Pakistan' => 143477,
    'Panama' => 143485,
    'Paraguay' => 143513,
    'Peru' => 143507,
    'Philippines' => 143474,
    'Poland' => 143478,
    'Portugal' => 143453,
    'Qatar' => 143498,
    'Romania' => 143487,
    'Russia' => 143469,
    'Saudi Arabia' => 143479,
    'Senegal' => 143535,
    'Serbia' => 143500,
    'Singapore' => 143464,
    'Slovakia' => 143496,
    'Slovenia' => 143499,
    'South Africa' => 143472,
    'Spain' => 143454,
    'Sri Lanka' => 143486,
    'St. Kitts & Nevis' => 143548,
    'St. Lucia' => 143549,
    'St. Vincent & The Grenadines' => 143550,
    'Suriname' => 143554,
    'Sweden' => 143456,
    'Switzerland' => 143459,
    'Taiwan' => 143470,
    'Tanzania' => 143572,
    'Thailand' => 143475,
    'The Bahamas' => 143539,
    'Trinidad & Tobago' => 143551,
    'Tunisia' => 143536,
    'Turkey' => 143480,
    'Turks & Caicos' => 143552,
    'Uganda' => 143537,
    'UK' => 143444,
    'Ukraine' => 143492,
    'United Arab Emirates' => 143481,
    'Uruguay' => 143514,
    'USA' => 143441,
    'Uzbekistan' => 143566,
    'Venezuela' => 143502,
    'Vietnam' => 143471,
    'Yemen' => 143571
        );
  3. Bob says:

    Some of the stores return reviews from the US store:

    Bangladesh -> USA
    Cote D’Ivoire -> USA
    Liechtenstein -> USA
    Maldives -> USA
    Nepal -> USA
    Serbia -> USA
    Ukraine -> USA

    They should be removed from the list to prevent duplicates.
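
One possible way to drop them without editing the list by hand is sketched below. The $appStores array is shortened for the example (IDs taken from the list above), and $usMirrors is a name made up here.

```php
<?php
// Shortened store list for the example; IDs taken from the list above.
$appStores = array(
    'Bangladesh' => 143490,
    'Italy'      => 143450,
    'Nepal'      => 143484,
    'USA'        => 143441,
);

// Stores reported to return the US reviews.
$usMirrors = array(
    'Bangladesh', 'Cote D\'Ivoire', 'Liechtenstein', 'Maldives',
    'Nepal', 'Serbia', 'Ukraine',
);

// array_flip turns the names into keys so array_diff_key can drop them.
$appStores = array_diff_key($appStores, array_flip($usMirrors));

print_r(array_keys($appStores));   // Italy and USA remain
```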

    • Simone says:

      I suspect that Apple wouldn’t like it 😉

      If they wanted users to get reviews easily, they would provide a clean API to do it. An app that scrapes their iTunes web site would pass their review process only by mistake, IMHO.

  4. Will Mavis says:

    This is awesome!

    Am I completely missing something? Does Apple provide any way for developers to see reviews of their apps (other than through iTunes or the website)? I’ve ignored reviews until now. I finally got time to go investigate and I couldn’t find anything. Your script was exactly what I needed. Thanks for saving me a lot of time!

    Sadly, I also discovered by using your script that each store has its own reviews. I only ever looked at the US reviews! It makes me wonder what else I’m missing.

    I am going to modify the code a little on mine to pull back some other info (rankings, etc). Would you like me to pass my changes back to you to add to the master version?

    I’m also going to create a php script and xsl template to allow me to quickly navigate around to each of my apps and view them as HTML in a browser. I will share those as well if others are interested.
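
A minimal sketch of such an HTML view, using SimpleXML rather than XSL. It assumes the XML has one review element per review, with child elements named after the keys used in the first comment (title, reviewer, rating, version, date, body). The root element name, the sample data and the reviewsToHtml() function are assumptions.

```php
<?php
// Inline sample; in practice this would be read from a file in output/.
// The root element name and the values are made up.
$xml = <<<EOX
<reviews>
  <review id="1">
    <title>Nice &amp; simple</title>
    <reviewer>someone</reviewer>
    <rating>4</rating>
    <version>1.0</version>
    <date>01-Feb-2012</date>
    <body>Does what it says.</body>
  </review>
</reviews>
EOX;

// Hypothetical function: turn the review XML into an HTML fragment.
function reviewsToHtml($xml)
{
    $doc = simplexml_load_string($xml);
    if ($doc === false) {
        return '';                      // not well-formed XML
    }

    $html = '';
    foreach ($doc->review as $review) {
        // Escape every value: review text is untrusted input.
        $title    = htmlspecialchars((string) $review->title);
        $reviewer = htmlspecialchars((string) $review->reviewer);
        $rating   = htmlspecialchars((string) $review->rating);
        $version  = htmlspecialchars((string) $review->version);
        $date     = htmlspecialchars((string) $review->date);
        $body     = htmlspecialchars((string) $review->body);

        $html .= "<p><b>$date, version $version</b><br>"
               . "<i>$rating: $title</i> ($reviewer)</p>"
               . "<blockquote>$body</blockquote>\n";
    }
    return $html;
}

echo reviewsToHtml($xml);
```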

    Thanks again! I’m so happy!!

    • Simone says:

      Apple allows you to check reviews via iTunes Connect, too. But they only show reviews for one product at a time, and you need to pick the store you want to see the reviews for, which makes it a pain.

      Improvements to the script are always appreciated 🙂

  5. 4levels says:

    Great script, works like a charm!

    I noticed a small bug (or change in the iTunes format) causing the date to show only the year.

    If you add the following lines, you’ll be all set to go!

     ...
    216      $version = trim($matches[1]);
    217      $matches = explode('-', $info);
    ->      $year = array_pop($matches);
    ->      $month = array_pop($matches);
    ->      $day = array_pop($matches);
    ->      $date = trim($day.'-'.$month.'-'.$year);
    ...
    • Simone says:

      I think that the date format is not the same across all the stores. I did notice that problem some time ago, but honestly I haven’t found the will and time to fix it 😉

  6. Brian says:

    Sorry, but I’m not sure how to get to the output. I’ve changed the array to:

    $products = array(
    'id493837609' => 'Modules Unraveled Podcast',
    );

    But when I run it in terminal, I don’t see where the output goes…

    Sorry if this is a total newb question, I’d just really like to see my podcast reviews.

    • Simone says:

      You should look at the output directory (which should have been created where macappstore.php is).

      There you’ll find a “493837609_Modules Unraveled Podcast.xml” file, if at least one review has been downloaded.

      By the way, you should remove the “id” part in the $products array: the key is just the numeric id.

        • Simone says:

          Looks like podcast reviews can’t be fetched in the same way as apps (at first I thought yours was an app 😉).

          You can try this, although it might not get every store’s data correctly.

          Or you can try this: http://pastebin.com/CG0sfzQQ (or mix that with my script: the biggest difference is the way it creates the X-Apple-Store-Front header. I just don’t have enough time to make that configurable at the moment…)
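
A rough sketch of setting that header per store with PHP's stream contexts. Only the header name comes from the discussion here; using the bare store ID as its value is an assumption (real requests may need a longer value), and storeContextOptions() is a made-up helper. No request is actually sent.

```php
<?php
// Hypothetical helper: HTTP context options for one store.
// Assumption: the header value is just the numeric store ID.
function storeContextOptions($storeId)
{
    return array(
        'http' => array(
            'method' => 'GET',
            'header' => "X-Apple-Store-Front: $storeId\r\n",
        ),
    );
}

// Two entries from the store list, enough for the example.
$appStores = array('Italy' => 143450, 'USA' => 143441);

foreach ($appStores as $store => $storeId) {
    $options = storeContextOptions($storeId);
    $context = stream_context_create($options);
    // $context would then go to file_get_contents($url, false, $context),
    // where $url is the review page address (not shown here).
    echo $store, ': ', trim($options['http']['header']), "\n";
}
```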

          • Brian says:

            Ok, thanks. I tried to poke around a bit, but I’m not a programmer, so, since this isn’t mission critical, I’ll check into it some other time.

            Thanks!

  7. EugeneT says:

    If you replace:
    $matches = explode( '-', $info );
    with:
    $matches = explode( ' -', $info );
    (note the added space), the script will no longer skip the day and month in dates formatted like DD-MM-YYYY.
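
To see the difference, here is a small sketch. The $info string is a made-up example of the text being parsed; the real layout may differ from store to store.

```php
<?php
// Made-up example of the text being parsed; the real layout is an assumption.
$info = 'by someone - Version 1.2 - 03-02-2012';

// Splitting on '-' also cuts the date apart, so the last piece is the year.
$parts = explode('-', $info);
$yearOnly = trim(end($parts));

// Splitting on ' -' (space, dash) leaves the dashes inside the date alone.
$parts = explode(' -', $info);
$fullDate = trim(end($parts));

echo $yearOnly, "\n";   // prints 2012
echo $fullDate, "\n";   // prints 03-02-2012
```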

  8. Quaizar Vohra says:

    The script doesn’t seem to be working for me. Perhaps Apple changed things again.
    I tried running the command manually and it asks me for a proxy password. Any chance you could update the script? I would greatly appreciate it.

    Thanks!
