A brief technical overview of the BBC personalised mobile homepage

Mark Longstaff-Tyrrell | 16:45 UK time, Wednesday, 20 January 2010

Most people I know are never more than a few feet away from their mobiles. They provide access to email, text and internet, including social networking sites. They're fashion accessories. They're becoming increasingly personal pieces of equipment and in the same way that junk mail landing on your doormat feels intrusive, so it is with mobile websites cluttering up your screen with unwanted content.

When it was released in April last year, the BBC personalised mobile homepage aimed to address this by allowing many aspects of the page to be personalised.

This post aims to explain some of the technologies we've developed to generate personalised mobile pages that we hope provide you with the best experience, regardless of the make and model of your device.

Rendering pages for different devices

The appearance of the mobile homepage changes depending on the device you use to view it. If you browse the mobile homepage using a touchscreen device such as an Android phone or an iPhone, you'll be presented with larger text and images to make it easier to navigate with your finger.

Non-touchscreen phones that use a trackball or buttons to navigate are presented with a more compact page. In addition, links to mobile iPlayer and other multimedia content are selectively displayed based on the phone's capabilities and whether it's connecting over WiFi or 3G.

In order to do this, the page content is defined using a device-independent XML representation. Each tag in the device-independent XML is then translated into a fragment of XHTML appropriate to the capabilities of the client handset. These XHTML fragments make up a library of common components, such as links, headings and list items. This approach means that the look of the site is maintained throughout and also that the entire design can be updated by simply changing the templates.

Here's an example. The following XML describes the Radio & Music topic:

<header editable="true" text="Radio &amp; Music" url="/mobile/radio/"/>
<now_on_air title="NOW ON AIR"/>
<list style="plainList">
  <channel_list-item channel_url="/mobile/radio/radio1/index2.shtml?region=london" channel_name="Radio 1" brand_url="b00pjl2g" brand_name="Greg James"/>
</list>
<list style="boldList">
  <list-item text="More stations and schedules" url="/mobile/customise/11"/>
</list>
<list style="audioList">
  <list-item demi="15" text="Podcasts" url="/mobile/radio/podcasts/index.shtml"/>
</list>

The XML above renders like this on an iPhone:

iphone.png

And like this on a Nokia 6331:

nokia.png

On the 6331 version the text and image sizes are reduced to account for the smaller screen and the podcast link is hidden.
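
To make the translation step more concrete, here is a minimal Python sketch of the idea: each tag is looked up in a per-device template library and filled in from its attributes. It is an illustration only, not the production code. The `<topic>` wrapper element, the device class names, the template strings, and the rule that an 'audioList' is dropped on handsets without audio support are all assumptions made for the example.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical template library: one XHTML fragment per (tag, device class).
TEMPLATES = {
    ("header", "touch"): '<h2 class="large"><a href="{url}">{text}</a></h2>',
    ("header", "compact"): '<h2><a href="{url}">{text}</a></h2>',
    ("list-item", "touch"): '<li class="large"><a href="{url}">{text}</a></li>',
    ("list-item", "compact"): '<li><a href="{url}">{text}</a></li>',
}

# A cut-down version of the topic XML, wrapped in an assumed <topic> root.
TOPIC_XML = """
<topic>
  <header editable="true" text="Radio &amp; Music" url="/mobile/radio/"/>
  <list style="audioList">
    <list-item demi="15" text="Podcasts" url="/mobile/radio/podcasts/index.shtml"/>
  </list>
</topic>
"""


def render_element(element, device_class):
    """Fill the template for one tag from its (escaped) attributes."""
    template = TEMPLATES[(element.tag, device_class)]
    values = {name: escape(value) for name, value in element.attrib.items()}
    return template.format(**values)  # unused attributes are ignored


def render(xml_source, device_class, supports_audio):
    """Translate device-independent XML into XHTML for one device class."""
    root = ET.fromstring(xml_source)
    parts = []
    for child in root:
        if child.tag == "list":
            # Assumed rule: audio lists are hidden on handsets without audio.
            if child.get("style") == "audioList" and not supports_audio:
                continue
            items = [render_element(item, device_class) for item in child]
            parts.append("<ul>" + "".join(items) + "</ul>")
        else:
            parts.append(render_element(child, device_class))
    return "".join(parts)


touch_page = render(TOPIC_XML, "touch", supports_audio=True)
compact_page = render(TOPIC_XML, "compact", supports_audio=False)
```

Because the page description never mentions a device, changing the look for a whole class of handsets only means editing the entries in the template table.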

Here's a simplified diagram of the flow during a mobile page request:

diagram.png

Page personalisation

Successfully navigating over 60 regional news areas, 17 sports categories, 181 football teams, 18 news topics, 9 radio stations and 6 TV channels on a mobile device requires some organisation. To this end, many of the topics on the mobile homepage can be personalised to show only the information you're interested in. When you personalise your page, the settings are stored in a cookie on your device. Because so many personalisation combinations are available, the cookie format had to store these settings compactly, to reduce the storage space consumed on the device, while remaining flexible enough to allow future development.

In the end we settled on the format shown below:

11_3_8_4___G9_10__CD11__CK12_14_15_16_

Each of these fragments represents a topic on the homepage and its personalisation settings. The position of the fragment in the cookie determines the order in which the topics appear in the page.

Topics that can be personalised contain extra information in their fragment that represents the personalisation state of the topic. For example, the fragment '10__CD' describes the 'Television' topic and can be split into three fields: '10', '__' and 'CD'. The '10' is the topic ID, the next two characters store the TV region as a 2-digit base 42 (b42) number and the rest of the fragment stores the selected TV channels. In this case the channels are 'CD', which correspond to BBC1 and BBC2. Adding BBC3 to the page changes the fragment to '10__CDF'. Both the channels and the order in which they appear in the topic are stored. The formats of the other topics vary depending on the information to be stored and are outlined briefly below. We don't use vowels in the configuration cookies, to avoid spelling unfortunate four-letter words. With this many combinations they're bound to occur.
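
As a rough illustration of how the Television fragment splits into its three fields, here is a small Python sketch. The function and class names are made up for the example, and only the three channel codes mentioned above are mapped to names; any other code is passed through unchanged.

```python
from dataclasses import dataclass
from typing import List

# Only the channel codes named in this post; other codes stay undecoded.
CHANNEL_NAMES = {"C": "BBC1", "D": "BBC2", "F": "BBC3"}


@dataclass
class TelevisionSettings:
    topic_id: int
    region_code: str       # raw 2-character b42 field, left undecoded here
    channels: List[str]    # channels in the order they appear in the topic


def parse_television(fragment: str) -> TelevisionSettings:
    """Split a fragment such as '10__CD' into topic ID, region and channels."""
    # The topic ID is the run of digits at the start of the fragment.
    id_length = 0
    while id_length < len(fragment) and fragment[id_length].isdigit():
        id_length += 1
    if id_length == 0:
        raise ValueError("fragment does not start with a topic ID")

    topic_id = int(fragment[:id_length])
    region_code = fragment[id_length:id_length + 2]
    channel_codes = fragment[id_length + 2:]
    channels = [CHANNEL_NAMES.get(code, code) for code in channel_codes]
    return TelevisionSettings(topic_id, region_code, channels)


settings = parse_television("10__CDF")
```

Since the channel field is just a string of single-character codes, reordering the channels on the page is a matter of reordering the characters.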

The topic IDs and their personalisation encodings are as follows:


1 Promo - none
3 News - 2 character b42 region + n character feed list
4 Weather - 3 character b42 region + 1 character b42 display format
8 Sport - 2 character b42 region + n character feed list
9 Entertainment - none
10 Television - 2 character b42 region + n character feed list
11 Radio & Music - 2 character b42 region + n character feed list
12 iPlayer - none
14 Featured Sites - none
15 Search - none
16 MyClub - 4 character club ID + 2 character b42 display format

The characters used for the base 42 encoding are:

'_CDFGHJKLMNPQRSTVWXYZbcdfghjklmnpqrstvwxyz'.
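
For the curious, a fixed-width base 42 field can be encoded and decoded along these lines in Python. This is a sketch under one assumption the post doesn't spell out: that the most significant digit comes first. The function names are invented for the example.

```python
B42_ALPHABET = "_CDFGHJKLMNPQRSTVWXYZbcdfghjklmnpqrstvwxyz"
BASE = len(B42_ALPHABET)  # 42 symbols: '_' is zero, 'C' is one, and so on


def b42_decode(text: str) -> int:
    """Decode a b42 string, assuming the most significant digit is first."""
    value = 0
    for char in text:
        value = value * BASE + B42_ALPHABET.index(char)
    return value


def b42_encode(value: int, width: int) -> str:
    """Encode a non-negative integer as a fixed-width b42 string."""
    if not 0 <= value < BASE ** width:
        raise ValueError("value does not fit in the requested width")
    chars = []
    for _ in range(width):
        value, digit = divmod(value, BASE)
        chars.append(B42_ALPHABET[digit])
    return "".join(reversed(chars))


default_region = b42_decode("__")   # the all-underscore field decodes to 0
two_char_capacity = BASE ** 2       # 1764 distinct values in two characters
```

Two characters therefore cover 1,764 values and three cover 74,088.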

If you're so inclined, you can play about with the configuration format to see how it works. Paste the following URL into your desktop browser, edit the configuration and see what happens.

/mobile/ps/11_3_DfCV8__GW4__DG9_10_DGD11__HM12_14_15_16__C___/?bookmark

You'll notice that you can't remove the promo or search topic. These are now permanently part of the page but still appear in the personalisation settings. They'll be removed in the future.

Scaling personalised applications

The system described so far has all the functionality required, but to cope with the high load it faces, caching must be used to reduce the work done by the servers. 'Caching' is the process of storing in memory a piece of data that takes time to be rendered or downloaded, so that the next time you need it you can simply look it up.

Non-personalised pages are relatively simple to cache as each user sees the same page. But personalised pages, where each user has a personal view onto a page, require a little more thought; a user in Manchester doesn't want to see the weather for a user in Birmingham.

The problem we have is that the number of combinations of personalised pages is huge. The order of topics in the page alone gives us over 300,000 combinations. So what we've done is to cache the individual page topics separately, rather than the complete pages. When a client request is received for a particular topic order, the topics are simply retrieved from the cache in that order and concatenated to form the complete page. This immediately reduces the potential amount of data to be cached by a few orders of magnitude, but there's still the problem of the topic content.
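
The idea of caching topics rather than pages can be sketched in a few lines of Python. This is an illustration only: the `build_page` function, the plain dictionary standing in for the cache, and the use of each cookie fragment as an opaque cache key are assumptions made for the example.

```python
from typing import Callable, Dict, List


def build_page(fragments: List[str],
               cache: Dict[str, str],
               render_topic: Callable[[str], str]) -> str:
    """Assemble a page by concatenating cached topics in the requested order."""
    parts = []
    for fragment in fragments:
        if fragment not in cache:
            # Cache miss: render this topic once and keep the result.
            cache[fragment] = render_topic(fragment)
        parts.append(cache[fragment])
    return "".join(parts)


render_calls = []


def example_render(fragment: str) -> str:
    """Stand-in renderer that records how often it is called."""
    render_calls.append(fragment)
    return "<div>" + fragment + "</div>"


topic_cache: Dict[str, str] = {}
page_a = build_page(["9_", "10__CD"], topic_cache, example_render)
page_b = build_page(["10__CD", "9_"], topic_cache, example_render)
```

The two pages have different topic orders, but each topic is rendered only once; the second request is served entirely from the cache.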

For example, the news component has over a million combinations, and that's before all the regional news feeds have been factored in. We can't cache them all. Luckily there are a couple of things on our side:

  1. Not all personalisation combinations are equally represented: It turns out that nearly 70% of all requests are for the same couple of dozen personalisation combinations. The last 30% is still a large number, but the majority of the load can be effectively managed by caching.
  2. We don't want to cache the components forever: Many topics contain dynamic data such as news stories that need to be updated periodically. Caching components for just a few seconds is enough to considerably reduce the number of requests per second while not filling up the server cache. Perhaps in the future we could exploit the observation in point 1 and employ a more intelligent caching system where the more popular configurations are cached for longer.
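
Short-lived caching of this kind can be sketched as follows in Python. The `TTLCache` class is hypothetical and keeps entries in a local dictionary purely for illustration; it never evicts stale entries unless they are requested again, which a real cache store would take care of itself.

```python
import time
from typing import Callable, Dict, Tuple


class TTLCache:
    """Keep each rendered topic for a few seconds, then render it again."""

    def __init__(self, ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so expiry can be tested without waiting
        self._entries: Dict[str, Tuple[float, str]] = {}  # key -> (expiry, value)

    def get_or_render(self, key: str, render: Callable[[], str]) -> str:
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]  # still fresh: no rendering needed
        value = render()
        self._entries[key] = (now + self._ttl, value)
        return value


# Example with a fake clock: one render serves every request for 5 seconds.
fake_now = [0.0]
news_cache = TTLCache(ttl_seconds=5, clock=lambda: fake_now[0])
render_count = [0]


def render_news() -> str:
    render_count[0] += 1
    return "news-%d" % render_count[0]


first = news_cache.get_or_render("3_", render_news)
fake_now[0] = 4.0
second = news_cache.get_or_render("3_", render_news)   # served from cache
fake_now[0] = 6.0
third = news_cache.get_or_render("3_", render_news)    # expired, rendered again
```

Even a lifetime of a few seconds means a popular configuration is rendered once per interval rather than once per request, while rarely used configurations drop out quickly instead of accumulating.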

Conclusion

The mobile homepage is still under constant development and there are many aspects of the system that can be improved. Mobile development is still a relatively new discipline and presents its own unique problems. But it's by building novel systems that we develop the techniques to solve them, some of which may find uses beyond their original application.

I hope this has provided an insight into some of the work that we do here at BBC Mobile. With ever more powerful devices becoming available it's a very exciting field to be working in at the moment. I hope we can help to make it an exciting experience for you too.

Mark Longstaff-Tyrrell is a software engineer on the BBC mobile platform.

Comments

  • Comment number 1.

    Interesting behind the scenes insight.

    Now when will we be able to browse BBC blogs on a mobile device......

  • Comment number 2.

    Thanks for the interesting peek behind the scenes.

    Can anyone explain why when I select Swindon as my location I get the weather for Salisbury, even though Swindon does have a mobile weather page?

    Back in May last year I raised this with the BBC and was sent an email saying: "We've passed on this issue to our development team to investigate - there do appear to be some issues with the system that matches postcodes to forecasts.

    I can't give you an exact timetable for a fix on this but please be assured that the matter is being looked at."

    Any news on a fix, please? Thanks!

  • Comment number 3.

    Having access to the most recent blog list that the main bbc home page has would be useful (ideally with a few more blogs on it)!

  • Comment number 4.

    Why can I listen again to some radio on iPlayer via my iPhone, but not listen to radio live, even if I'm connected via WiFi?

  • Comment number 5.

    Hello - this is not a general post for queries about mobile. Can people stay on topic please.

  • Comment number 6.

    Nice post, but I didn't see anything about the technology behind all the numbers.

    What tech do you use for the cache? memcache, database, open/closed source?

    What language(s) is it developed in?

    Does it run on a cluster or some big iron?

    It would be nice to have a bit of insight into the technology that can cope with such demand.

  • Comment number 7.

    5. At 11:28am on 21 Jan 2010, Nick Reynolds wrote:

    "Hello - this is not a general post for queries about mobile. Can people stay on topic please."

    Hi Nick, just a little observation, is this blog actually on-topic for this section? Well I know it is, my point being, doesn't it actually fit better within the "Web Developer" section, with perhaps a link back here, it seems to be well 'over the top' (technically speaking) here, the XML code samples are leaving many people with glazed-over eyes I suspect and thus a total misunderstanding of what the topic actually is...

    But that said, thanks for giving it space!

  • Comment number 8.

    "as a 2 digit base 42 (b42) number ... don't use vowels in the configuration cookies to avoid spelling unfortunate four letter words."

    I can understand both
    - a cookie being shown in alphanumeric form rather than say hex
    (it makes the cookie shorter)
    - avoiding vowels (as quoted from blog)

    Is this a common standard or from a Douglas Adams SciFi fan, number 42

  • Comment number 9.

    @Dr_Bean - we do not currently have the infrastructure to offer live streaming of television or radio content to iPhone or iPod Touch devices. We are looking into providing this functionality in a future version of iPlayer.

  • Comment number 10.

    @drt - the mobile weather component is currently being updated after which it will be able to resolve local areas. This should be live in 6 weeks or so.

  • Comment number 11.

    @iainhubbard - we use memcached and MySQL, with software written in PHP and Java. The mobile homepage, like an increasing number of BBC sites, runs on the Forge platform. This is a collection of software and systems for developing and deploying large scale web applications. It's worthy of a blog post of its own, but in the meantime here's a link to some slides from a presentation about it last year (pdf):

  • Comment number 12.

    Thanks Mark for your reply, looking forward to the update.

