How much does your web browser weigh? Mine weighs about
150 grams and fits in my pocket. Sound hopelessly expensive or
esoteric? The browser runs inside a USD$60 cell phone, and according
to a study cited recently in the Industry Standard (http://www.thestandard.com/research/metrics/display/0,2799,15258,00.html),
some 484 million people will be accessing the Internet via mobile
devices by 2005.
Reaching these users will take more than persuading them to go through
the pain of entering your long URL on their miniaturized keypads. The
central challenge is to build a useful service that is well-suited to
small screens and requires minimal user input. The reward for meeting
these challenges is the opportunity to expand the application space,
providing your users with new types of services previously unavailable
on the Web.
There are a number of standards for development of wireless web
applications, but the momentum in May 2000 is strongly behind
the Wireless Application Protocol (WAP) architecture. This standard
is written and maintained by the WAP Forum (http://www.wapforum.org), an
industry consortium including Phone.com (formerly Unwired Planet),
Nokia, Motorola, Ericsson, and many more. Most Web-enabled cell
phones on the market today support WAP. Alternatives for wireless
application development include the Handheld Device Markup Language, a
predecessor to WAP which is still supported on many cell phones, the
Palm Query Application a.k.a. "Web Clipping" (Palm Pilot
VII), and text-only HTML (Qualcomm pdQ smartphone, NTT DoCoMo i-mode
phone in Japan).
How Does It Work?
You can serve content to wireless devices with a standard Web server:
(Figure: the phone, the service provider's gateway, and the Web server.
After the WAP Architecture Specification, Figure 2.)
The cell phone connects to your web server through a service
provider's gateway. The gateway translates encoded requests from the
phone to standard HTTP GETs and POSTs, fetches the results, encodes
the results, and returns the results to the phone. Data are passed
between the gateway and cell phone in binary form for compact
transmission, using a WAP-specific set of protocols. The gateway also
manages the relatively unreliable network connection to the phone.
WAP content is delivered in Wireless Markup Language (WML) format.
WML supports a somewhat different set of user interactions than HTML,
and adheres strictly to the XML standard.
To serve WAP devices, the Web server simply needs to return WML with
the appropriate headers and tags:
HTTP/1.0 200 OK
MIME-Version: 1.0
Content-Type: text/vnd.wap.wml

<?xml version="1.0"?>
<!DOCTYPE wml PUBLIC "-//WAPFORUM//DTD WML 1.1//EN"
  "http://www.wapforum.org/DTD/wml_1.1.xml">
<wml>
  <card>
    <p>Hello, world!</p>
  </card>
</wml>
WML documents are designated by the text/vnd.wap.wml
Content-Type and must all begin with an XML declaration and a
document type declaration. WML documents consist of any
number of discrete page views, or cards. A complete document
consisting of several cards is referred to as a deck. The
main distinction between a deck of cards and a collection of HTML
pages is that the entire deck is loaded atomically. This provides a
better user experience on browsers with small displays on slow,
unreliable networks.
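As a rough illustration of how little the server has to do, here is a minimal sketch in Python that answers every GET with the "Hello, world" deck and the WML Content-Type. It uses only the standard library's http.server module. The names WmlHandler and serve and the port number are assumptions made for this example, not part of any WAP toolkit.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

WML_CONTENT_TYPE = "text/vnd.wap.wml"

# The same one-card deck shown above.
HELLO_DECK = """<?xml version="1.0"?>
<!DOCTYPE wml PUBLIC "-//WAPFORUM//DTD WML 1.1//EN"
  "http://www.wapforum.org/DTD/wml_1.1.xml">
<wml>
  <card>
    <p>Hello, world!</p>
  </card>
</wml>
"""


class WmlHandler(BaseHTTPRequestHandler):
    """Answers every GET request with the one-card deck above."""

    def do_GET(self):
        body = HELLO_DECK.encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", WML_CONTENT_TYPE)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()  # writes the blank line between headers and body
        self.wfile.write(body)


def serve(port=8000):
    """Run the demo server until interrupted (the port is an arbitrary choice)."""
    HTTPServer(("", port), WmlHandler).serve_forever()
```

Calling serve() and pointing a simulator at the machine should be enough to see the card; a production site would of course generate decks dynamically.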
The deck in the "Hello, world" example contains just a
single card with a static message. Here is a more elaborate example:
HTTP/1.0 200 OK
MIME-Version: 1.0
Content-Type: text/vnd.wap.wml

<?xml version="1.0"?>
<!DOCTYPE wml PUBLIC "-//WAPFORUM//DTD WML 1.1//EN"
  "http://www.wapforum.org/DTD/wml_1.1.xml">
<wml>
  <!-- This is the first card. -->
  <card>
    <do type="accept" label="Answer">
      <go href="#card2"/>
    </do>
    <p>What is your name:
      <input name="Name"/></p>
  </card>
  <!-- This is the second card. -->
  <card id="card2">
    <do type="accept" label="Answer">
      <go href="#card3"/>
    </do>
    <p>What is your favorite color?
      <select name="Favorite">
        <option value="red">Red</option>
        <option value="blue">No, blue!</option>
      </select></p>
  </card>
  <!-- This is the third card. -->
  <card id="card3">
    <p>Name: $(Name)<br/>
      Color: $(Favorite)</p>
  </card>
</wml>
What it might look like on your phone:
(Screenshots: first card, second card, third card.)
This deck consists of three cards. The first two prompt the user for
input and the third displays the results. Results are passed between
cards using the document-level variables Name and Favorite. Variable
substitution is effected by prefixing variable names with a dollar
sign ($) and enclosing them in parentheses. Navigation between cards
is handled with go tasks inside card-level do elements. The do element
provides a general mechanism for users to act on cards. The type
attribute indicates what kind of action to associate with the do. An
"accept" action occurs when the user hits the OK button on her cell
phone. The go task plays a role similar to the HTML form element,
containing a target URL and optionally query data and method type. The
notable difference here is that user interface code appears outside
the go task.
Development environment
Most discussions of WAP development jump
immediately to the various software development toolkits available
from the cell phone and browser vendors (listed below). Consider
first the caveman-style approach to
developing WAP applications. You sit at the computer with your cell
phone. You type away at the keyboard making changes to server code.
You load the pages on your cell phone. You iterate. This approach
has the advantage of realism: you're forced to tap out URLs using the
smaller phone keypad, view content on the small screen, and wait for
the slow network to respond. The downside? You're forced to tap out
URLs using the smaller phone keypad, to view content on the small
screen, and wait for the slow network to respond. Oh yes, and then
there's the matter of those per-minute microbrowser charges from your
service provider.
Real development requires a mixture of simulators and phones with
emphasis on the former. The simulators are particularly useful for
exploring navigation and user experience strategies, checking
dynamically generated WML source, refreshing cached pages, and
tracking cookies. Telnet can also come in handy, as exemplified
below.
Case studies
Case 1: Conference seminar schedule
I was asked to develop a WAP interface to the developer conference
site ArsDigita is building for a Fortune 500 software company.
My colleagues and I decided that the most natural WAP application for
this kind of site would be a seminar schedule that enables conference
attendees to look up seminar topics, times, and locations at the push
of a button.
At the push of a button. The first thing that hits you when working
with WAP phones is the tedium of entering site URLs on numeric
keypads: each letter is entered by pressing the appropriate number key
and then, if necessary, repeating to select one of three or four
letters associated with the key. This point was driven home when an
MIT student asked me to review a WAP page he was working on for the
web applications class I help teach at MIT (http://6916.lcs.mit.edu). The URL
for the page was something akin to
http://lcswww75.lcs.mit.edu/register/wap-login.tcl. Typing this URL on
a Samsung SCH-3500 takes about 85 keystrokes after legally omitting
the "http://", with roughly half of the keystrokes devoted
to "/register/wap-login.tcl". (As an aside, this should teach
you something about the economics of WAP portals.) When I asked the
student why he hadn't provided navigation through a shorter URL, he
said, "Oh, I always used a PC-based simulator so it never occurred to
me."
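To get a feel for the arithmetic, here is a small Python sketch that estimates multi-tap keystrokes for a string. It assumes the standard letter layout of a 12-key keypad and, as a loudly labeled simplification, charges one press for every digit or punctuation mark. Real handsets differ on that point, so the result is only a rough estimate and not a measurement of any particular phone.

```python
# Letter groups on a standard 12-key phone keypad (keys 2 through 9).
KEYPAD_GROUPS = ["abc", "def", "ghi", "jkl", "mno", "pqrs", "tuv", "wxyz"]

# Presses needed per letter: its 1-based position within its key's group.
PRESSES = {
    letter: position
    for group in KEYPAD_GROUPS
    for position, letter in enumerate(group, start=1)
}


def multitap_keystrokes(text, other_cost=1):
    """Estimate the key presses needed to enter text by multi-tap.

    ASSUMPTION: digits and punctuation cost `other_cost` presses each.
    Handsets vary, so treat the result as a rough estimate.
    """
    return sum(PRESSES.get(ch, other_cost) for ch in text.lower())
```

For example, multitap_keystrokes("abc") counts one press for "a", two for "b", and three for "c".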
This situation can be ameliorated by having the top-level index page
try to guess the appropriate document type and redirect the browser
accordingly. Here is the algorithm used for the developer conference site:
- Pull the User-Agent string out of the request headers and compare
the first four characters against a list of known WAP microbrowsers (an
updated list is maintained at http://wap.colorline.no/wap-faq/useragents.php3).
- If no match is found, redirect to an HTML page.
- If a match is found, pull the Accept string out of the request
headers and search for "text/vnd.wap.wml".
- If no match is found, redirect to an HTML page. Otherwise, redirect to
a WML page.
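The redirect steps listed above can be sketched in a few lines of Python. This is an illustration rather than the code running on the conference site: the function name choose_markup is invented here, and the two prefixes in KNOWN_WAP_PREFIXES are placeholders standing in for a maintained list of user agents.

```python
# ASSUMPTION: illustrative four-character prefixes only. A real deployment
# would load a maintained list of known WAP user agents.
KNOWN_WAP_PREFIXES = {"UP.B", "Noki"}

WML_TYPE = "text/vnd.wap.wml"


def choose_markup(user_agent, accept):
    """Return "wml" or "html" following the redirect steps in the list."""
    # Steps 1-2: user agents not on the list always get HTML.
    if (user_agent or "")[:4] not in KNOWN_WAP_PREFIXES:
        return "html"
    # Steps 3-4: a known microbrowser must also advertise WML support.
    if WML_TYPE in (accept or "").lower():
        return "wml"
    return "html"
```

The index page would then issue a redirect to the WML or HTML entry point depending on the returned value.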
This scheme is somewhat pessimistic in that WML is only served to a
limited set of known WAP user agents. Is this reasonable? The Accept
header should provide all the information necessary to perform content
negotiation. Sadly, a number of browser vendors do not obey
HTTP 1.1 in this regard. The popular UP.Browser, for example,
includes text/html in the list of accepted content types
in spite of the fact that it fails upon trying to load HTML. Curious
to see how other webmasters approached the problem, I performed a
little experiment on www.yahoo.com:
bash-2.01$ telnet www.yahoo.com 80
Trying 216.32.74.55...
Connected to www.yahoo.akadns.net.
Escape character is '^]'.
GET / HTTP/1.1
User-Agent: UP.Browser
Accept: text/html

HTTP/1.0 302 RD
Location: http://phone.oa.yahoo.com/http://login.mobile.yahoo.com/

Connection closed by foreign host.
Despite the request for text/html, Yahoo! redirected to a
phone site. I could also download HTML with an Accept header
containing only text/vnd.wap.wml. The bottom line is that
with the current state of protocol compliance, the User-Agent header
conveys better information than the Accept header.
At the heart of the conference schedule service lies information about
individual seminars. This information is served dynamically from the
same database tables that are used for the HTML version of the site.
However, the display and user experience constraints are quite
different for desktop web browsers and wireless devices. With long
seminar titles like "Power Development Strategies using Java
Technology, XML, J2EE, and Oracle8i", for example, even a
moderately long list of seminars could make for a great deal of
scrolling on a 12-character wide screen. There are basically two ways
to reduce the need for scrolling. The first approach is to pass all
strings through a function which truncates to an acceptable length,
perhaps truncating preferentially at whitespace. Drill-down
navigation through the conference schedule is handled in this way. A
second approach is to override line-wrapping by using paragraph tags
with the mode attribute set to "nowrap".
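The first approach, truncating preferentially at whitespace, might look something like the following Python sketch. The function name truncate_title, the default limit of 24 characters (two lines of a 12-character screen), and the trailing "..." are all assumptions chosen for illustration.

```python
def truncate_title(title, max_len=24, ellipsis="..."):
    """Shorten title to at most max_len characters, preferring a word boundary.

    ASSUMPTION: max_len=24 and the "..." suffix are illustrative defaults.
    """
    if len(title) <= max_len:
        return title
    room = max_len - len(ellipsis)
    if room <= 0:
        return title[:max_len]
    head = title[:room]
    # Back up to the last space unless the cut already falls on a boundary.
    if not title[room].isspace() and " " in head:
        head = head[: head.rfind(" ")]
    return head.rstrip(" ,;:") + ellipsis
```

Applied to the long seminar title quoted above, this cuts after "Power Development" rather than in the middle of "Strategies".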
Building the conference schedule user experience was an exercise in
navigation design: help the user arrive at seminar cards as
efficiently as possible. The conference is a travelling show making
stops in major cities around the world, so we first prompt the user
to choose a city from a short list including the meeting currently in
session and the five upcoming sites. We also present an option to
view a list of all cities including past events. Individual seminars
are organized into a number of themed tracks which run simultaneously
over the course of two days. Having picked a city, the user is given
the choice of locating seminar information by track or by starting
time, followed by a list of tracks or starting times. Upon picking a
particular track or time, the user is presented with a list of
seminars each linking to a card with the seminar name, location, time,
and track.
Since a number of cards can be stacked into one document it is worth
acknowledging that you have some choice in structuring WML documents.
In theory, an entire site could be loaded into a single deck! In
practice, most microbrowsers have a document size limit of about
1200-1500 bytes. Furthermore, the publisher may wish to instruct the
browser not to cache certain information. Here is the site
structure for wireless users:
Case 2: ArsDigita employee phone directory
My original motivation to learn this material came from a simple wish:
to be able to easily contact any of my colleagues at ArsDigita, even
if none of us happened to be logged in or at our office desks. All of the
needed contact information (home, office, and mobile phone numbers)
was already web-accessible through the ArsDigita Community System's
intranet module (http://www.arsdigita.com/doc/intranet),
which we use to run the company.
As with the conference calendar, the primary task was to put a WAP
front end on existing data. ArsDigita's intranet is, unlike our
software, not open to the public. User login pages were needed to
restrict access. These pages were implemented in a parallel fashion to
the existing HTML pages, calling the same backend scripts but instead
presenting the results in WML. Just like in the ArsDigita Community
System (ACS), cookies are used to identify logged-in users. In my
experience cookies work well provided you do not deviate from the
magic cookie spec (available from http://home.netscape.com/newsref/std/cookie_spec.html).
Notably, avoid using commas, spaces, and semicolons in the cookie
value. The user's email address on our intranet is also their
username, so to save typing, a default email domain
("@arsdigita.com") is appended if necessary when the user
logs in.
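Both conveniences are easy to express in code. The Python sketch below shows one way to append the default domain and to check a cookie value against the characters warned about above. The function names are invented for this example; only the "@arsdigita.com" default comes from the text.

```python
DEFAULT_DOMAIN = "@arsdigita.com"    # the default domain named in the text
FORBIDDEN_IN_COOKIE = set(",; \t")   # commas, semicolons, and whitespace


def normalize_login(typed):
    """Append the default domain when the user typed only the local part."""
    typed = typed.strip()
    return typed if "@" in typed else typed + DEFAULT_DOMAIN


def is_safe_cookie_value(value):
    """True if value avoids the characters the cookie spec disallows."""
    return not any(ch in FORBIDDEN_IN_COOKIE for ch in value)
```

A user who keys in only the part before the "@" is treated as having typed the full company address, which saves a good number of keypad presses.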
In May 2000, ArsDigita is about 160 people. To look up an individual's
information, I implemented a search tool instead of drill-down
navigation. Searches may be performed on the first few characters of
the user's first name, last name, or email address (by which some of
us are primarily known). The search tool returns a single employee
card if one match is found, a list of employees (linking to employee
cards) if 2-10 matches are found, and a request to narrow the search
if more than ten matches are found.
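The dispatch logic can be sketched as follows in Python. The Employee record and the in-memory list are hypothetical stand-ins for the intranet's database tables, and the handling of zero matches is an assumption, since the text does not say what the real tool does in that case.

```python
from dataclasses import dataclass


@dataclass
class Employee:  # HYPOTHETICAL record shape, for illustration only
    first: str
    last: str
    email: str


def search(employees, prefix, max_list=10):
    """Return (kind, matches); kind is 'none', 'card', 'list', or 'narrow'."""
    p = prefix.strip().lower()
    matches = [
        e for e in employees
        if p and any(f.lower().startswith(p) for f in (e.first, e.last, e.email))
    ]
    if not matches:
        return "none", []       # ASSUMPTION: the text does not cover zero matches
    if len(matches) == 1:
        return "card", matches  # show the employee card directly
    if len(matches) <= max_list:
        return "list", matches  # list of links to employee cards
    return "narrow", []         # too many hits: ask for a longer prefix
```

In the real service the prefix match would more likely be pushed into the SQL query than done in application code.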
Thanks to the Wireless Telephony Application Interface (WTAI) it is a
fairly simple matter to "link to" a phone number, saving the user from
otherwise having to:
- copy or memorize the requested number,
- quit their microbrowser,
- dial the number to make the call.
The WTAI provides a means to create telephony applications from within
a WML page. WTAI functions are named using uniform resource
identifiers (URIs) resembling

wtai://<library>/<function>(;<argument>)*

Given a phone number from a search, e.g. 617-555-1212, the following
link

wtai://wp/mc;6175551212

enables dialling. This link utilizes the Make Call function (mc) in
the Public WTAI library (wp).
The dark side of this scheme: there are a variety of ways to lump
phone number digits depending on what country you happen to live in.
For this reason, the keepers of our intranet have been content to
store phone numbers as free-form SQL varchars rather than
segmented into integer fields such as the North American
3-digit area code + 3-digit exchange + 4-digit suffix. What I
do is search the phone number field for a 3-3-4 pattern and link to it
if a match is found. If no match is found, I pull out all digits from
the field and provide the user with an edit box prior to linking.
This fallback strategy, while not as elegant, still saves the user
from having to copy and re-enter the digits.
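Here is a Python sketch of that two-tier strategy. The regular expression and the function name wtai_call_link are my own illustration of the idea, not the code used on the intranet, and the set of separators it tolerates (spaces, dots, hyphens, parentheses around the area code) is an assumption.

```python
import re

# Three digits, three digits, four digits, with optional common separators.
# The lookarounds keep the pattern from matching inside a longer digit run.
NANP_PATTERN = re.compile(
    r"(?<!\d)\(?(\d{3})\)?[\s.-]*(\d{3})[\s.-]*(\d{4})(?!\d)"
)


def wtai_call_link(phone_field):
    """Return (uri, needs_confirmation) for a free-form phone number field."""
    match = NANP_PATTERN.search(phone_field)
    if match:
        # Clean 3-3-4 match: link straight to the Make Call function.
        return "wtai://wp/mc;" + "".join(match.groups()), False
    # Fallback: strip everything but digits and let the user edit the
    # result in an input box before dialling.
    digits = re.sub(r"\D", "", phone_field)
    return ("wtai://wp/mc;" + digits if digits else None), True
```

When needs_confirmation is true, the card would show the digits in an edit box rather than a direct link.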
Beyond the basics
Learning a new markup language isn't particularly interesting in and
of itself. WML could die. That might even be nice: one less markup
language to learn. Mobile accessibility is certainly valuable, as it
allows you to share your services with a much larger class of users.
Serving up linkable phone numbers to a phone is even more interesting,
since it provides a new type of service previously unavailable on the
Web. Another new type of service, which is perhaps not yet a reality,
is the marketers' dream of personalized, geo-spatialized advertising.
Imagine turning on your WAP browser mid-day to find the following
message looking back at you:
Hungry for pizza?
PizzaShack
123 Main Street
Only 3 blocks away!
1> Large pie
2> Large Soda
-------------------
Order
To summarize, beyond accessibility, the real value of WAP lies in
application frameworks like the WTAI.
In the examples above we created sets of WML pages to serve
essentially the same content as their sister HTML pages, with little
effort or emphasis devoted to consolidation of common tasks like
pulling data out of the database. The maintainable approach is to use
a single procedure to obtain data, and then any of several separate
rendering tools on the server or browser to generate appropriate
markup. This approach is typically what is being described when the
eXtensible Markup Language (XML), the Extensible Stylesheet Language
(XSL), and XSL Transformations (XSLT) are invoked. Note that this
approach does not necessarily eliminate the need to maintain parallel
sets of display information: if you add a field to the lone
datasource, you'll still have to update all templates or stylesheets
with directions on where the new field is to appear. Note also that
the need to maintain parallel sets of display information is not
directly tied to the existence of multiple markup languages per
se. A truly impressive service would distinguish among
additional dimensions such as spoken language, bandwidth, and screen
size.
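A minimal Python sketch of the "one data procedure, several renderers" idea follows. Everything in it is hypothetical: get_seminar stands in for the real database query, and the record's field names and values are invented for illustration. It uses plain template functions rather than XSLT, which is a deliberate simplification.

```python
from html import escape


def get_seminar(seminar_id):
    """Single data-access procedure; a stand-in for the real database query."""
    # HYPOTHETICAL record: field names and values are illustrative only.
    return {"title": "XML & Java", "room": "Hall B", "time": "10:30"}


def wml_text(value):
    """Escape markup characters and WML's variable marker ($ becomes $$)."""
    return escape(value).replace("$", "$$")


def render_html(s):
    return "<h2>%s</h2><p>%s, %s</p>" % (
        escape(s["title"]), escape(s["room"]), escape(s["time"]))


def render_wml(s):
    return "<card><p>%s<br/>%s<br/>%s</p></card>" % (
        wml_text(s["title"]), wml_text(s["room"]), wml_text(s["time"]))


RENDERERS = {"html": render_html, "wml": render_wml}


def seminar_page(seminar_id, markup):
    """Fetch the record once, then hand it to the renderer for this client."""
    return RENDERERS[markup](get_seminar(seminar_id))
```

Adding a field to the record still means touching both render functions, which is exactly the maintenance cost noted above.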
Employee phone directory: Retrospective
After three months of use, the ArsDigita employee phone directory has
proved stable, if still catching on. One pitfall revealed by early
use was that certain HTML character entities do not constitute valid
WML. Searches that returned a German colleague's name, for example,
resulted in an error message because ü ("u" with an
umlaut) was encoded in the database as &uuml;, whereas WML
requires the alternate encoding, &#252;.
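One way to guard against this class of error is to rewrite HTML named entities as numeric character references before emitting WML. The Python sketch below does so using the standard library's html.entities table; the function name is an assumption, and this is not the fix actually applied to the directory.

```python
import re
from html.entities import name2codepoint

# Entities predefined by XML itself are left untouched.
XML_PREDEFINED = {"amp", "lt", "gt", "quot", "apos"}
ENTITY = re.compile(r"&([A-Za-z][A-Za-z0-9]*);")


def html_entities_to_numeric(text):
    """Rewrite HTML named entities as numeric character references."""
    def replace(match):
        name = match.group(1)
        if name in XML_PREDEFINED or name not in name2codepoint:
            return match.group(0)  # leave XML and unknown entities alone
        return "&#%d;" % name2codepoint[name]
    return ENTITY.sub(replace, text)
```

Passing every database string through such a filter on its way into a card keeps names with accented characters from breaking the deck.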
More
General WAP/WML information:
SDK's and WAP-enabled browsers are available from
Web Application Programming tools
Other