Simple Object Access Protocol
by Paul Konigsberg (paulk@arsdigita.com)
Submitted on: 2000-09-25
Last updated: 2000-09-25
What is SOAP?
SOAP (Simple Object Access Protocol) is an easy way to invoke a remote
procedure call (RPC) on another machine. Executing a remote procedure call simply
means sending a request to some other machine, executing a program on that other
machine, and receiving the program's output on your original home machine.
One of SOAP's primary goals is to rely entirely upon existing technology in
order to accomplish this. Therefore, it uses a flexible
XML specification which can be sent over HTTP or SMTP. SOAP works on the
request/response model of making remote procedure calls. The client program
forms a request, which consists of a valid SOAP XML document, and sends it over
HTTP or SMTP to a server. The server picks up the SOAP request, parses and validates
it, invokes the requested method with any supplied parameters, forms
a valid SOAP XML response, and sends it back over HTTP or SMTP.
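The server-side half of that cycle can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not any particular SOAP package: the function names, the `methods` lookup table, and the `result` element are hypothetical, and real validation and fault handling are omitted.

```python
import xml.etree.ElementTree as ET


def split_tag(tag):
    """Split an ElementTree tag of the form '{namespace}name' into its parts."""
    if tag.startswith("{"):
        namespace, _, name = tag[1:].partition("}")
        return namespace, name
    return "", tag


def handle_soap_request(xml_text, methods):
    """Parse a request envelope, invoke the named method, build a response envelope.

    `methods` is a hypothetical dict mapping method names to Python callables.
    """
    envelope = ET.fromstring(xml_text)  # raises ParseError on malformed XML
    if envelope.tag != "Envelope":
        raise ValueError("root element is not Envelope")
    body = envelope.find("Body")
    if body is None or len(body) == 0:
        raise ValueError("Body is missing or empty")

    # The first child of Body names the method; its children are the parameters.
    call = body[0]
    namespace, method_name = split_tag(call.tag)
    if method_name not in methods:
        raise ValueError("unknown method: " + method_name)
    params = {child.tag: child.text for child in call}
    result = methods[method_name](**params)

    # Form the response: same element name with "Response" appended.
    response = ET.Element("Envelope")
    response_body = ET.SubElement(response, "Body")
    if namespace:
        reply_tag = "{%s}%sResponse" % (namespace, method_name)
    else:
        reply_tag = method_name + "Response"
    reply = ET.SubElement(response_body, reply_tag)
    ET.SubElement(reply, "result").text = str(result)  # "result" is an assumed name
    return ET.tostring(response, encoding="unicode")
```

The transport (HTTP or SMTP) is deliberately left out, so the same function could sit behind either one.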
Why Use SOAP?
SOAP is very similar to XML-RPC: both try to accomplish the same goal of using
the reliable technology of XML to make remote procedure calls.
In fact, both have original contributions from some of the same authors.
Some of the minor differences between SOAP and XML-RPC lie in the XML tag
structuring of the transmitted data, particularly multi-ref data.
SOAP also makes use of a URN (Uniform Resource Name).
Additionally, SOAP is a slightly newer specification.
Other popular choices for accomplishing remote procedure calls are
the DCOM (Distributed Component Object Model)
and CORBA (Common Object Request Broker Architecture) protocols.
In CORBA, the Object Request Broker (ORB) is middleware that sits between
clients and servers.
In the CORBA model a client can request a service without knowing
anything about what servers are on the network. The ORB receives the
request, forwards it to the appropriate server, and hands back the result.
Both of these protocols
have shortcomings in server-to-client communications. Both protocols
rely primarily on a single vendor as well as specific platforms. (For CORBA this
means every machine runs the same ORB product. DCOM has been ported to other
platforms but is primarily a Microsoft Windows platform product.)
Both protocols perform poorly in Internet scenarios where a client and a server
may be running different operating systems, may have a slow and lossy
connection, and may be divided by proxy or firewall devices. These
proxy and firewall devices are a little more friendly to HTTP.
(SOAP uses HTTP whereas DCOM and CORBA use a proprietary protocol.)
One software package which tries to solve the RPC problem is Java's RMI
(Remote Method Invocation). Although RMI can be very useful,
it requires the server application to be written in Java and,
for all practical purposes, the client application as well.
SOAP applications can
be written in any programming environment for both the server and the client as
long as they meet the proper XML interfacing requirements. SOAP supports
interoperation between Perl, Tcl, C, Python, and PHP environments in addition
to Java.
RMI is also better suited to complicated problems that call for a richer
distributed object model,
whereas SOAP is simpler and better suited to executing isolated functions remotely.
They are meant for different classes of problems.
You might ask, "Why don't I just make up my own DTD
(Document Type Definition)
and use my scripting language of choice to take the form or URL
variables and push back the XML?" This is a good question, and for one
small application it might be an easier solution than installing all that
is required to parse SOAP requests. With SOAP, however, the boring
work of translating your application data into XML and back is done
for you. You no longer have to worry about writing a lot of custom parsing
code or data
consistency checks because these tasks are done by the SOAP processor.
In addition, you
are now taking the client application data in XML instead of form or URL
variables. XML data is structured, whereas URL and form variables
are completely unstructured or have some arbitrarily imposed
organization. XML is completely text based, which makes it easy to debug,
and there are many XML parsers available on many different platforms.
SOAP helps you fully leverage the power of XML.
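To make the "translating your application data into XML and back" step concrete, here is a minimal Python sketch of the kind of work a SOAP processor takes off your hands. The helper names `to_xml` and `from_xml` are hypothetical, and the sketch only handles a flat set of fields; a real processor also deals with nesting, types, and namespaces.

```python
import xml.etree.ElementTree as ET


def to_xml(tag, fields):
    """Serialize a flat dict of application data as one XML element."""
    element = ET.Element(tag)
    for name, value in fields.items():
        ET.SubElement(element, name).text = str(value)
    # The library escapes characters such as < and & for us.
    return ET.tostring(element, encoding="unicode")


def from_xml(xml_text):
    """Parse the element back into (tag, dict). All values come back as strings."""
    element = ET.fromstring(xml_text)
    return element.tag, {child.tag: child.text for child in element}
```

Round-tripping `{"item": "soap", "quantity": 3}` through these two helpers returns the quantity as the string "3", which hints at why a full SOAP processor carries type information alongside the data.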
The fact that SOAP can be used over SMTP (instead of HTTP)
means that SOAP
can be used as an asynchronous method of remote procedure calling.
SOAP requests are emailed via SMTP to a mailbox on a server.
A process on the server picks up those emails,
executes the SOAP requests, and emails the SOAP
responses back to the sender. Both the client and the server
can pick up and respond to the SOAP emails at any point in time.
This asynchronous methodology may prove useful for
applications executed over long time intervals, or run
over short connection times on low-bandwidth modems.
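As a rough sketch of the client side of SOAP over SMTP, the Python below wraps an envelope in an email message. The function name, the subject line, and the addresses are assumptions made for illustration; the article does not say how a particular SOAP-over-SMTP implementation lays out its messages.

```python
from email.message import EmailMessage


def soap_email(envelope_xml, sender, recipient):
    """Wrap a SOAP envelope in an email message with a text/xml body."""
    message = EmailMessage()
    message["From"] = sender
    message["To"] = recipient
    message["Subject"] = "SOAP request"  # assumed; any subject would do
    message.set_content(envelope_xml, subtype="xml")  # Content-Type: text/xml
    return message


# Sending is left out so the sketch runs without a mail server. With one
# available, smtplib.SMTP(host).send_message(message) would deliver it.
```

The server-side process would read the mailbox, extract the XML body from each message, and mail a response built the same way back to the sender.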
To use the SOAP protocol over HTTP, the basic necessary components
are a web server, an extension by which the web server can
run or call programs on the server, an XML parser, and a SOAP
interpreter. Some or all of these components may be integrated together.
[Figure: Components for a SOAP server over HTTP.]
SOAP Basic Syntax
SOAP requests are HTTP POST requests with the text/xml content type. The following is an example
of a minimal SOAP request.
POST /server_name/object_to_invoke HTTP/1.1
Host: 216.34.106.248
Content-Type: text/xml
Content-Length: 152
SOAPMethodName: urn:application_name:object_name#method_name

<Envelope>
  <Body>
    <m:method_name xmlns:m='urn:application_name:object_name'>
      <data_field_1>Some Data</data_field_1>
    </m:method_name>
  </Body>
</Envelope>
The SOAPMethodName header begins with a URN (Uniform Resource Name), followed
by # and the method name. URNs are one type of
URI (Uniform Resource Identifier). A URL (Uniform Resource Locator) is
another type of URI. URNs follow the format
urn:<NID>:<NSS>
In this format <NID> is the namespace identifier and <NSS> is
the namespace-specific string. The difference between URLs and URNs
is that URLs reference objects with a specific location
and a specific access protocol, whereas URNs reference objects
independently of their location.
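The urn:<NID>:<NSS> layout is simple enough to take apart by hand. The following Python sketch (the function name `parse_urn` is hypothetical, and it checks only the overall shape, not the full URN character rules) splits a URN into its two parts:

```python
def parse_urn(urn):
    """Split 'urn:<NID>:<NSS>' into (NID, NSS); raise ValueError otherwise."""
    parts = urn.split(":", 2)  # the NSS may itself contain colons
    if len(parts) != 3:
        raise ValueError("not of the form urn:<NID>:<NSS>: " + urn)
    scheme, nid, nss = parts
    if scheme.lower() != "urn" or not nid or not nss:
        raise ValueError("not of the form urn:<NID>:<NSS>: " + urn)
    return nid, nss
```

For the URN in the request above, `parse_urn("urn:application_name:object_name")` gives the namespace identifier `application_name` and the namespace-specific string `object_name`.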
Note that the SOAPMethodName must match the first
child element
of the Body element. This is mostly a security feature which lets firewalls
filter the call without parsing all the XML. If these two pieces do not match,
then the request will be rejected. Also note the use of HTTP version 1.1,
which supports chunked data transfers and persistent TCP
connections as described in RFC 2616. SOAP will work with
HTTP 1.0, but if you want to keep the connection alive in that case,
it is recommended that you add the following
HTTP header to the request and the response.
Connection: Keep-Alive
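The matching rule between the SOAPMethodName header and the Body can be expressed as a short check. This Python sketch assumes the header has the namespace#method form shown in the request example; the function name is hypothetical, and a real server would reject the request rather than just return False.

```python
import xml.etree.ElementTree as ET


def method_header_matches(soap_method_name, envelope_xml):
    """Return True if the header names the first child element of Body."""
    namespace, _, method = soap_method_name.partition("#")
    body = ET.fromstring(envelope_xml).find("Body")
    if body is None or len(body) == 0:
        return False
    # ElementTree reports a namespaced tag as '{namespace}local_name'.
    return body[0].tag == "{%s}%s" % (namespace, method)
```

A firewall gets the same information from the header alone, which is the point of duplicating it outside the XML.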
Here is an example SOAP response to the previous example SOAP request.
Note that the first child element of the Body element will have the same
name as the first child element of the Body element in the request
but with Response appended to it.
HTTP/1.1 200 OK
Content-Type: text/xml
Content-Length: 162

<Envelope>
  <Body>
    <m:method_nameResponse xmlns:m='urn:application_name:object_name'>
      <result_data_1>Some Result Data</result_data_1>
    </m:method_nameResponse>
  </Body>
</Envelope>
(Note the Content-Length numbers may not add up correctly for this example.)
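In practice the Content-Length is computed rather than typed in: it is the number of bytes in the body that follows the blank line. A minimal Python sketch of assembling the request, with a hypothetical function name and no actual network call, might look like this:

```python
def build_soap_post(path, host, soap_method_name, envelope_xml):
    """Assemble the raw bytes of a SOAP POST with a correct Content-Length."""
    body = envelope_xml.encode("utf-8")
    header_lines = [
        "POST %s HTTP/1.1" % path,
        "Host: %s" % host,
        "Content-Type: text/xml",
        "Content-Length: %d" % len(body),  # bytes in the body, not characters
        "SOAPMethodName: %s" % soap_method_name,
    ]
    # HTTP separates header lines with CRLF and ends the headers with a blank line.
    head = "\r\n".join(header_lines) + "\r\n\r\n"
    return head.encode("ascii") + body
```

An HTTP client library would normally fill in this header for you; the sketch only shows where the number comes from.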
Is There a SOAP Example I Can See?
I built a small client Java application which takes a command line
argument and sends a SOAP request to a SOAP-enabled server Java application
I also wrote. The server application connects to and queries a database
with a query based on the command line parameter submitted to the client
application. The server application forms an XML response filled with data
from the result set of the query. The client application then prints
the XML data to the screen.
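The step where the server turns a query result set into XML can be sketched as follows. This is not the Java code described above: it is a Python stand-in that uses an in-memory SQLite database in place of Oracle, and the `users` table, its columns, and the `result_set` and `row` element names are all invented for the illustration.

```python
import sqlite3
import xml.etree.ElementTree as ET


def query_to_xml(connection, name_prefix):
    """Run a parameterized query and render the result set as XML."""
    cursor = connection.execute(
        "SELECT id, name FROM users WHERE name LIKE ? ORDER BY id",
        (name_prefix + "%",),  # bound parameter, never string concatenation
    )
    columns = [description[0] for description in cursor.description]
    result = ET.Element("result_set")
    for row in cursor:
        row_element = ET.SubElement(result, "row")
        for column, value in zip(columns, row):
            ET.SubElement(row_element, column).text = str(value)
    return ET.tostring(result, encoding="unicode")


def demo_connection():
    """Create a throwaway in-memory database with a few sample rows."""
    connection = sqlite3.connect(":memory:")
    connection.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
    connection.executemany(
        "INSERT INTO users (id, name) VALUES (?, ?)",
        [(1, "alice"), (2, "bob"), (3, "alina")],
    )
    return connection
```

In a SOAP server this XML would become the content of the Response element inside the Body.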
Here is the architecture I used when I
built these SOAP applications.
I used Apache as the web server and Jakarta-Tomcat as the Apache extension
which would be able to interpret requests for Java applications. Next, I had
to install the Xerces Java XML parser and then a SOAP package of utilities for
creating and manipulating SOAP Java objects (just one more .jar file).
(Note that when you download the SOAP package from apache.org it comes
with a few good SOAP examples: a calculator, an address book, and
a stock quote application.) The database I connected to was Oracle 8.1.6,
which was already installed prior to this endeavor.
The software packages I used can all be downloaded for free from:
Here's a list of some other useful SOAP-related sites.
Next is a listing of the code for the SOAP examples I have created.
Reader's Comments
I'd like to thank Bill Schneider, Curtis Galloway, and Joe Bank for their editorial help and expertise in writing this article.
-- Paul Konigsberg, March 6, 2001
There is some misleading information regarding CORBA in the article.
It states that every machine must run an ORB from the same vendor,
which is false: an important principle in CORBA is interoperability of
implementations.
The article goes on to argue that CORBA and DCOM perform poorly over a
slow connection. It is astonishing to use this as an argument in favor
of SOAP: the binary transport encodings used by these middleware
platforms use several orders of magnitude less bandwidth than the
verbose (wasteful) textual encoding used in SOAP.
The next point made is that HTTP passes through firewalls more easily
than the CORBA and DCOM protocols. While this is currently true in
practice, it overlooks the fact that firewalls are open to HTTP
because it has historically had limited security implications. Given
that HTTP streams may now contain complex client-server interactions,
firewall vendors will need to examine the semantics of the traffic at
the encapsulated level. It is arguably _more difficult_ to do this for
SOAP traffic than for CORBA or DCOM requests.
The paragraph concludes by characterizing DCOM and CORBA's protocols
as "proprietary". In both cases (respectively DCE RPC and IIOP) the
protocol is standardized by an international body, and there exist
multiple independent, interoperating implementations.
While SOAP does have several things going for it, they are more
sociological than technical in my opinion: it is buzzword enabled,
backed by Microsoft, builds on well understood Internet mechanisms,
and lets you encode requests by hand. It is also much less ambitious than
platforms such as CORBA, and thus easier to learn. It is a sad
reflection on the state of computing that these factors probably
overweigh technical criteria.
-- Eric Marsden, March 13, 2001
I agree with the technical arguments, but I don't think CORBA/DCOM are a downright technically better alternative to SOAP. Yes, at machine level SOAP is less efficient. But it is simpler, easier to learn and easier to inter-operate across different platforms.
Inter-operability and programmer-time are generally far more expensive than machine-time. Think about integrating chaotic environments (different platforms, enterprises, ...) So people (including me) conclude that SOAP has a lower TCO, and is therefore technically better in a lot of cases.
Due to the obvious performance advantages and richer feature set, DCOM/CORBA still have their place in the sun. And if your environment is not so chaotic, you wouldn't be wrong in choosing among them.
-- Edmar Wiggers, May 22, 2001
I'm new to SOAP but let me venture into the SOAP versus the rest of the world argument. SOAP is a good way to call a remote service running on a Web server and to pass the service parameters and receive results. Considering how new SOAP is, that's about all there is to it today.
Corba has been around for years and offers the mature services one needs to build a distributed system, including Time services, encryption and authentication services, real marshalling of objects, and a transport protocol (IIOP).
If I was trying to build a distributed system for a Bank I would choose Corba. If I was trying to offer a calorie counter application for Weight Watcher's I would choose SOAP.
There has always been a struggle in the software world between monolithic-thinking people and pragmatic-thinking people. The monoliths love Corba because it is complete. The pragmatists love SOAP because it is lightweight. A pragmatist expects all the parts of a complex system will come together as the result of a bunch of disparate engineers working on individual functions. The monolith needs to guarantee delivery of an entire system.
And the battles between them are fascinating.
-Frank
-- Frank Cohen, August 2, 2001
Some general observations about SOAP and its
contribution to the world of distributed computing...
It is a common misconception that SOAP is somehow
-easy- to work with.
A SOAP request or a WSDL file is not normally readable
by a Human Being unless it was written by a Human
Being. A WSDL file for example is a very complicated
format full of envelopes, messages, ports, parts,
types and bindings and very little information is
available on how it works, at a less than scientific
level. Being able to read an xml begin and end tag is
not enough even though most experts will tell you that
this is the case. You have to understand the *meaning*
of the tags and no DOM parser will give you that no
matter how smart it is.
Imagine buying a car and it comes in pieces with no
instructions on how to put it together. However all
the pieces are standard and any car can exchange parts
with any other car. Unless of course they are a
Microsoft car in which case only green parts are
interchangeable and only if you use Microsoft tools.
The missing instructions are also out there, but you
have to find out for yourself where to find them and
the instructions are written by the car inventors for
other car inventors, not for you.
I hope you get the point. SOAP is not just SOAP. SOAP
is Apache Tomcat, Xerces, WSDL, XML, DOM, Schemas,
URI's, URN's, RPC routers, XSD and of course all the
technical details of setting up your Java development
environment with all the right packages you have to
find first on various open source web sites.
And once you have all that you can start to think
about how to deploy it to your customers.
You just have to be an expert on too many things at
once and that is a very negative thing about SOAP -
really, I mean that. CORBA is a heck of a lot easier
to understand and work with.
So please stop giving people the impression that SOAP
is so playfully easy. It's not.
-- Michael Berg, August 16, 2001
I am a distributed computing specialist practising in the field for over 10 years. I do firmly believe that CORBA and COM approaches are tight and good. But, given issues like availability of bandwidth, the web as network and the proven reach of technology using browsers, I think SOAP is better!
Actually, it's better even when using generic frameworks - like MOMs, as it gives more leverage in terms of utilizing the knowledge and also opens the doors to connect the system to diverse systems.
I think we should rather take SOAP in light of the opportunity to make our systems easily available and hence integrate-able rather than it replacing some existing technology.
After all, technologies existed and advanced all these years and will do so. But the main point is the newer technology should not forget the old ones and learn to interact with them. SOAP serves this purpose and also adds a level of simplicity, which is of the essence when devising large systems distributed over huge spans.
I think SOAP would definitely help us in achieving the Open System standards all of us dream of!
Thanks for taking the time to read my note. Appreciated.
-- Ajay MR, August 27, 2001
As a user and proponent of XML, HTTP and Java I wanted to add the following. SOAP evolved as a standard XML packaging format for putting payloads on the wire. It was a wire protocol that would level the XML semantic playing field so systems could communicate effectively [interoperate?] over HTTP. A year ago it didn't make sense to compare it to COM/DCOM or even Corba. There was very little in the SOAP spec that addressed many of the issues already handled by DCOM or Corba. Now however, there are many proposals to help solve these problems and compete in the DCOM/Corba space. In fact it looks like SOAP will now try to solve all those problems encountered by the previous frameworks all over again. XML and HTTP are key, but the abstractions, overhead and complications [drop the "Simple" part] being introduced in SOAP may end up driving it from adoption. In come the vendors and there is further divergence from the original interoperability goals. Thanks, Bob Bisantz
-- bob bisantz, November 28, 2001