DCOM Overview

 

© 2000  Microsoft Corporation. All rights reserved.

The information contained in this document represents the current view of Microsoft Corporation on the issues discussed as of the date of publication. Because Microsoft must respond to changing market conditions, it should not be interpreted to be a commitment on the part of Microsoft, and Microsoft cannot guarantee the accuracy of any information presented after the date of publication.

This White Paper is for informational purposes only. MICROSOFT MAKES NO WARRANTIES, EXPRESS OR IMPLIED, IN THIS DOCUMENT.

Microsoft, Active Directory, Hotmail, Intellimirror, Outlook, Windows, and Windows NT are either registered trademarks or trademarks of Microsoft Corporation in the United States and/or other countries.

Other product or company names mentioned herein may be the trademarks of their respective owners.

Microsoft Corporation • One Microsoft Way • Redmond, WA 98052-6399 • USA

Introduction

Microsoft's Distributed Component Object Model (DCOM) extends the Component Object Model (COM) to support communication among objects on different computers--whether on a local area network (LAN), a wide area network (WAN), or even the Internet. With DCOM, your application can be distributed in the locations that make the most sense for your customer and for the application.

Because DCOM is a seamless evolution of COM, the world's leading component technology, you can take advantage of your existing investment in COM-based applications, components, tools, and knowledge to move into the world of standards-based distributed computing. As you do so, DCOM handles the low-level details of network protocols so you can focus on your real business: providing great solutions to your customers.

With the growth of the Internet, Information Technology (IT) managers are once again excited at the prospect of using component software technology--the idea of breaking large, complex software applications into a series of pre-built and easily developed, understood, and changed software modules called components--as a means to deliver software solutions much more quickly and at a lower cost.

The goal is to achieve economies of scale for software deployment across the industry. A component architecture for building software applications will enable this by:

·        Speeding development--enabling programmers to build solutions faster by assembling software from pre-built parts.

·        Lowering integration costs--providing a common set of interfaces for software programs from different vendors means less custom work is required to integrate components into complete solutions.

·        Improving deployment flexibility--making it easier to customize a software solution for different areas of a company by simply changing some of the components in the overall application.

·        Lowering maintenance costs--isolating software function into discrete components provides a low-cost, efficient mechanism to upgrade a component without having to retrofit the entire application.

A distributed component architecture applies these benefits across a broader scale of multiuser applications. The Distributed Component Object Model (DCOM) has three unique strengths that make it a key technology for achieving this:

·        DCOM is based on the most widely used component technology today (COM).

·        DCOM is the best networking technology to extend component applications across the Internet.

·        DCOM is an open technology that runs on multiple platforms.

The combination of these three factors--the largest installed base, native support for Internet protocols, and open support for multiple platforms--means that businesses can realize the benefits of a modern component application architecture without having to replace investments in existing systems, staff, and infrastructure.

Where to Get DCOM

DCOM currently ships with Microsoft Windows 2000, Windows NT® 4.0, and Windows 98. DCOM for the Microsoft Windows® 95 operating system is available for download at the Microsoft Web site.

DCOM implementations on all major UNIX platforms are available from Software AG (http://www.sagus.com).

DCOM Architecture Overview

This section starts you on a guided tour through the inner and outer workings of DCOM. You will soon see how DCOM realizes the promise of easy distributed computing without compromising flexibility, scalability, or robustness.

DCOM sits right in the middle of your application components; it provides the invisible glue that holds things together. The following figure shows how the pieces fit:

Figure 1 DCOM Architecture

At the center of COM are mechanisms for establishing connections to components and creating new instances of components. These mechanisms are commonly referred to as activation mechanisms. The following sections describe how they work.

One of the most basic requirements of a distributed system is the ability to create components. In the COM world, object classes are named with globally unique identifiers, or GUIDs. When GUIDs are used to refer to particular classes of objects, they are called Class IDs (CLSIDs). These Class IDs are nothing more than fairly large integers (128 bits) that provide a collision-free, decentralized namespace for object classes. If a COM programmer wants to create a new object, he or she calls one of several functions in the COM libraries:

Function                        Description
------------------------------  -----------------------------------------------
CoCreateInstance(Ex) (<CLSID>)  Creates an interface pointer to an uninitialized instance of the object class <CLSID>.
CoGetInstanceFromFile           Creates a new instance and initializes it from a file.
CoGetInstanceFromIStorage       Creates a new instance and initializes it from storage.
CoGetClassObject (<CLSID>)      Returns an interface pointer to a "class factory object" that can be used to create one or more uninitialized instances of the object class <CLSID>.
CoGetClassObjectFromURL         Returns an interface pointer to a "class factory object" for a given class. If no class is specified, this function will choose the appropriate class for a specified MIME type. If the desired object is installed on the system, it is instantiated. Otherwise, the necessary code is downloaded and installed from a specified URL.

VB and VBScript developers can use CreateObject instead of these functions to instantiate COM and DCOM objects.  The underlying code generated by VB or the VBScript engine then calls the appropriate function to create the object.

The COM libraries look up the appropriate binary code (dynamic-link library or executable) in the system registry, create the object, and return an interface pointer to the caller.

For DCOM, the object creation mechanism in the COM libraries is enhanced to allow object creation on other machines. In order to be able to create a remote object, the COM libraries need to know the network name of the server. Once the server name and the CLSID are known, a portion of the COM libraries called the Service Control Manager, or SCM, on the client machine connects to the SCM on the server machine and requests creation of the object.

DCOM provides two fundamental mechanisms for allowing clients to indicate the remote server name when an object is created. The remote server name can be indicated:

1.    As a fixed configuration in the system registry or in the DCOM Class Store (see the next section for details).

2.    As an explicit parameter to CoCreateInstanceEx, CoGetInstanceFromFile, CoGetInstanceFromIStorage, or CoGetClassObject.

The first mechanism, indicating the remote server name as a fixed configuration, is extremely useful for maintaining location transparency: clients need not know whether a component is running locally or remotely. When the remote server name is made part of the server component's configuration information on the client machine, clients do not have to maintain or obtain the server location. All a client ever needs to know is the CLSID of the component. It simply calls CoCreateInstance (or CreateObject in Visual Basic®, or "new" in Java), and the COM libraries transparently create the correct component on the preconfigured server. Even existing COM clients that were designed before the availability of DCOM can transparently use remote components using this mechanism.

Note that a server machine cannot forward creation requests to yet another machine using the RemoteServerName. If machine X uses RemoteServerName to indicate that objects of CLSID C should be created on machine Y, and machine Y has a RemoteServerName specified for CLSID C that points to machine Z, the objects requested by machine X will be created on machine Y. For more information, see "Referrals" in the "Connection Management" section of this paper.

For many applications, having a single, externally configured server name for each component is sufficient. It keeps the client's code free from having to manage server configuration data: if the server name changes, the registry (or the class store) is changed and the application continues to work without further action.

The remote server name is stored in the system registry under a new key in HKEY_CLASSES_ROOT (HKCR):

[HKEY_CLASSES_ROOT\APPID\{<appid-guid>}]
        "RemoteServerName"="<DNS name>"

In turn, the class ID entry for the component has a new named value that points to the APPID:

[HKEY_CLASSES_ROOT\CLSID\{<clsid-guid>}]
        "AppId"="<appid-guid>"

The APPID concept was introduced as part of the security support in COM and is fully described in the "Securing Distributed Applications" section later in this document. The APPID essentially represents a process that is shared by multiple CLSIDs. All objects in this process share the same default security settings.

The APPID concept can be used to avoid redundant registry keys that all contain the same server name. CLSIDs that are known to always run on the same server machine (typically because they are implemented in the same executable or DLL) can all point to the same APPID key and thus all share the same RemoteServerName registry key.
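The lookup chain described above (CLSID to AppId to RemoteServerName) can be sketched as a toy model. This is not the COM libraries' implementation: the nested dictionary stands in for each machine's registry, and the machine names, GUID placeholders, and function names are all made up for illustration. The sketch also models the rule that a server never forwards an activation request, because only the client's own configuration is consulted.

```python
# Toy model of the registry entries described above. The GUID placeholders
# and machine names are hypothetical.
REGISTRY = {
    "machineX": {
        "CLSID": {"{C}": {"AppId": "{A}"}},
        "APPID": {"{A}": {"RemoteServerName": "machineY"}},
    },
    "machineY": {
        "CLSID": {"{C}": {"AppId": "{A}"}},
        "APPID": {"{A}": {"RemoteServerName": "machineZ"}},
    },
}


def remote_server_name(machine, clsid):
    """Follow CLSID -> AppId -> RemoteServerName in one machine's registry."""
    hive = REGISTRY.get(machine, {})
    app_id = hive.get("CLSID", {}).get(clsid, {}).get("AppId")
    if app_id is None:
        return None
    return hive.get("APPID", {}).get(app_id, {}).get("RemoteServerName")


def activation_host(client_machine, clsid):
    """Return the machine on which the object would be created.

    Only the client's configuration is read. Whatever RemoteServerName the
    chosen server has for the same CLSID is ignored, so there is no forwarding.
    """
    return remote_server_name(client_machine, clsid) or client_machine
```

In this model, a request from machineX for class {C} resolves to machineY, even though machineY's own registry points at machineZ. A class with no AppId entry is simply created on the client machine.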

The DCOM Configuration tool (DCOMCNFG.exe, which ships as part of Windows 2000, Windows NT, Windows 98, and DCOM for Windows 95) allows configuration of remote server names without the need to modify the registry directly. Figure 2 shows the DCOM Configuration tool on Windows 2000 with the remote server set to a server named AppServer.

Figure 2


Starting with Microsoft® Windows® 2000 Server, COM provides a central store for COM classes. All activation-related information about a component can optionally be stored in the Active Directory on the domain controller, just as the logon and authentication services store user credentials on the domain controller. The COM libraries transparently retrieve activation information--including RemoteServerName configurations--from the Active Directory. Changes to a component's configuration information in the Active Directory automatically propagate to all clients connected to this portion of the Active Directory.

Some applications require explicit run-time control over the server to which a client connects. Examples include chat applications, multiplayer games, and administrative tools that need to perform remote administration on specific machines.

For this kind of application, COM allows the remote server name to be explicitly specified as a parameter to CoCreateInstanceEx, CoGetInstanceFromFile, CoGetInstanceFromIStorage, or CoGetClassObject. VB developers can also specify the remote server name with CreateObject. The developer of the client code is in complete control of the server name being used by COM for remote activation.

Remote Method Calls: Marshaling and Unmarshaling

When a client wants to call an object in another address space, the parameters to the method call must somehow be passed from the client's process to the object's process. The client places the parameters on the stack. In the case of a direct object invocation, the object reads the parameters from the stack and writes return values back to the stack.

For remote invocations, some code (typically the COM libraries) needs to read all parameters from the stack and write them to a flat memory buffer so they can be transmitted over a network. The process of reading parameters from the stack into a flat memory buffer is called marshaling. Parameter marshaling is non-trivial, since parameters can be arbitrarily complex; they can be pointers to arrays or pointers to structures. Structures can in turn contain arbitrary pointers, and many data structures even contain cyclic pointer references. In order to successfully invoke a remote method call with such complex parameters, the marshaling code has to traverse the entire pointer hierarchy of all parameters and retrieve all the data, so that it can be reinstated in the object's process space.

The counterpart to marshaling is the process of reading the flattened parameter data and recreating a stack that looks exactly like the original stack set up by the caller. This process is called unmarshaling. Once the stack is recreated, the object can be called. As the call returns, any return values and output parameters need to be marshaled from the object's stack, sent back to the client, and unmarshaled into the client's stack.
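To make the pointer-traversal problem concrete, the following minimal sketch flattens a linked structure that may contain a cycle into a byte buffer and rebuilds it on the other side. It is an illustration only: the Node type, the record layout, and the little-endian packing are assumptions chosen for the example and are not the format COM actually puts on the wire.

```python
import struct


class Node:
    """A parameter structure: an integer plus a pointer to another Node."""

    def __init__(self, value, next_node=None):
        self.value = value
        self.next = next_node


def marshal(root):
    """Flatten every Node reachable from root into a bytes buffer.

    Each node becomes a record (value, index of next node), with -1 for a
    null pointer. Replacing pointers by indexes lets the walk stop on cycles.
    """
    index = {}   # id(node) -> position of its record in the buffer
    order = []
    node = root
    while node is not None and id(node) not in index:
        index[id(node)] = len(order)
        order.append(node)
        node = node.next
    buf = struct.pack("<i", len(order))
    for n in order:
        nxt = -1 if n.next is None else index[id(n.next)]
        buf += struct.pack("<ii", n.value, nxt)
    return buf


def unmarshal(buf):
    """Rebuild an equivalent Node graph from the flat buffer."""
    (count,) = struct.unpack_from("<i", buf, 0)
    if count == 0:
        return None
    records = [struct.unpack_from("<ii", buf, 4 + 8 * i) for i in range(count)]
    nodes = [Node(value) for value, _ in records]
    for n, (_, nxt) in zip(nodes, records):
        n.next = None if nxt < 0 else nodes[nxt]
    return nodes[0]
```

The rebuilt graph consists of new objects in the receiver's address space, yet it has the same shape as the original, including any cycle. That is the property the marshaling code must preserve.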

COM provides sophisticated mechanisms for marshaling and unmarshaling method parameters that build on the remote procedure call (RPC) infrastructure defined as part of the DCE standard. DCE RPC defines a standard data representation for all relevant data types, the Network Data Representation (NDR).

In order for COM to be able to marshal and unmarshal parameters correctly, it needs to know the exact method signature, including all data types, types of structure members, and sizes of any arrays in the parameter list. This description is provided using Interface Definition Language (IDL), which is also built on the DCE RPC standard IDL. IDL files are compiled using a special IDL compiler (typically the Microsoft® IDL compiler, or MIDL, which is part of the Win32 SDK). The IDL compiler generates C source files that contain the code for performing the marshaling and unmarshaling for the interface described in the IDL file. The client-side code is called the proxy, while the code running on the object's side is called the stub.

The MIDL-generated proxies and stubs are COM objects that are loaded by the COM libraries as needed. When COM needs to find the proxy/stub combination for a particular interface, it simply looks up the Interface ID under the HKEY_CLASSES_ROOT\Interface key in the system registry and reads the ProxyStubClsid32 key:

REGEDIT4
[HKEY_CLASSES_ROOT\Interface\{<IID_Interface>}\ProxyStubClsid32]
        @="{<CLSID_ProxyStub>}"
[HKEY_CLASSES_ROOT\CLSID\{<CLSID_ProxyStub>}\InprocServer32]
        @="c:\\proxy-stub.dll"
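The division of labor between proxy and stub can be illustrated with a small sketch. It is deliberately simplified and makes several assumptions: JSON stands in for NDR, a direct function call stands in for the network, and the Calculator class and its add method are hypothetical. Real proxies and stubs are generated by the IDL compiler, not written by hand like this.

```python
import json


class Calculator:
    """The real object, living in the 'server' address space."""

    def add(self, a, b):
        return a + b


class Stub:
    """Server side: unmarshals a request, calls the object, marshals the reply."""

    def __init__(self, target):
        self._target = target

    def dispatch(self, request_bytes):
        request = json.loads(request_bytes.decode("utf-8"))
        method = getattr(self._target, request["method"])
        result = method(*request["args"])
        return json.dumps({"result": result}).encode("utf-8")


class Proxy:
    """Client side: offers the same method, but forwards it as a byte buffer."""

    def __init__(self, transport):
        self._transport = transport  # any callable taking and returning bytes

    def add(self, a, b):
        request = json.dumps({"method": "add", "args": [a, b]}).encode("utf-8")
        reply = self._transport(request)
        return json.loads(reply.decode("utf-8"))["result"]
```

A client that holds Proxy(Stub(Calculator()).dispatch) calls add exactly as it would on a local object. Only bytes cross the boundary between the two sides.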

Connection Management

The primary mechanism for controlling an object's lifetime is reference counting, using the AddRef and Release methods of IUnknown. AddRef and Release are called quite often, and sending every call to a remote object would introduce a serious network performance penalty. Hence, DCOM optimizes AddRef and Release calls for remote objects.

The optimization process uses OXID objects, which implement the IRemUnknown interface. An OXID (object exporter ID) determines the RPC string bindings used to contact a group of objects. Remote reference counting is conducted per interface (per IPID) using the RemAddRef and RemRelease methods of IRemUnknown. Using a single call, RemAddRef and RemRelease can increment or decrement the reference count of many different IPIDs by an arbitrary amount; this allows for greater network efficiency.

In the interests of performance, client COM implementations typically do not immediately translate each local AddRef and Release into a remote RemAddRef and RemRelease. For example, the standard proxy implementation defers the actual remote release of all interfaces on an object until all local references to all interfaces on that object have been released. Furthermore, one actual remote reference count is used to service many local reference counts.
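The deferral described above can be modeled in a few lines. This sketch is a simplification under stated assumptions: it tracks a single reference count per object rather than one per IPID, and the class and method names (RemoteObject, StandardProxy, rem_add_ref, rem_release) are hypothetical stand-ins rather than the IRemUnknown signatures.

```python
class RemoteObject:
    """Server-side object that counts remote references and remote calls."""

    def __init__(self):
        self.remote_refs = 0
        self.calls_received = 0

    def rem_add_ref(self, count):
        self.calls_received += 1
        self.remote_refs += count

    def rem_release(self, count):
        self.calls_received += 1
        self.remote_refs -= count


class StandardProxy:
    """Client-side proxy: many local references share one remote reference."""

    def __init__(self, remote):
        self._remote = remote
        self._local_refs = 0

    def add_ref(self):
        if self._local_refs == 0:
            self._remote.rem_add_ref(1)   # first local reference: go remote once
        self._local_refs += 1
        return self._local_refs

    def release(self):
        if self._local_refs == 0:
            raise RuntimeError("release without matching add_ref")
        self._local_refs -= 1
        if self._local_refs == 0:
            self._remote.rem_release(1)   # last local reference: release remotely
        return self._local_refs
```

With this arrangement, one hundred local add_ref calls followed by one hundred release calls produce two network round trips in total, not two hundred.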

Pinging

Remote reference counting would be entirely adequate if clients never terminated abnormally, but in fact they do, and the system needs to be robust in the face of clients terminating abnormally when they hold remote references.

Pinging is a well-known mechanism for detecting when clients terminate abnormally. At the server machine, each exported object (each exported OID) has a pingPeriod time value and a numPingsToTimeOut count, which combine to determine the overall amount of time known as the "ping period." If the ping period elapses without receiving a ping on an OID, all the remote references to interfaces associated with that OID are considered expired and can be garbage collected, based on local reference information.

Pinging on a per-object basis can be very inefficient, so DCOM contains an optimized pinging infrastructure. Pings are sent and received by OXID resolvers. The resolver determines which OIDs are on the same machine and generates a ping set. A single ping is sent for the entire set.
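The grouping and timeout logic can be sketched as follows. The sketch assumes that pingPeriod and numPingsToTimeOut combine by multiplication, and it uses plain numbers for time. The function names and data shapes are hypothetical and do not reflect the OXID resolver's actual interface.

```python
from collections import defaultdict


def timeout_seconds(ping_period, num_pings_to_timeout):
    """Time without pings after which an OID's remote references expire."""
    return ping_period * num_pings_to_timeout


def build_ping_sets(references):
    """Group (server_machine, oid) pairs into one ping set per server machine."""
    ping_sets = defaultdict(set)
    for machine, oid in references:
        ping_sets[machine].add(oid)
    return dict(ping_sets)


def expired_oids(last_ping, now, ping_period, num_pings_to_timeout):
    """Return, sorted, the OIDs whose last ping is older than the timeout."""
    limit = timeout_seconds(ping_period, num_pings_to_timeout)
    return sorted(oid for oid, seen in last_ping.items() if now - seen > limit)
```

A client holding references to three objects on two server machines therefore sends two pings per period, one per ping set, instead of three.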

For more information about the pinging protocol, see section 2.5 of the DCOM Binary Protocol.

Securing Distributed Applications

Designing a distributed application poses several challenges to the developer. One of the most difficult design issues is that of security: Who can access which objects? Which operations is an object allowed to perform? How can administrators manage secure access to objects? How secure does the content of a message need to be as it travels over the network?

Mechanisms to deal with security-related design issues have been built into DCOM from the ground up. DCOM provides an extensible and customizable security framework upon which developers can build when designing applications.

Different platforms use different security providers, and many platforms even support multiple security providers for different usage scenarios or for interoperability with other platforms. DCOM and RPC are designed in such a way that they can simultaneously accommodate multiple security providers.

All these security providers provide a means of identifying a security principal (typically a user account), a means of authenticating a security principal (typically through a password or private key), and a central authority that manages security principals and their keys. If a client wants to access a secured resource, it passes its security identity and some form of authenticating data to the resource and the resource asks the security provider to authenticate the client. Security providers typically use low-level custom protocols to interact with clients and protected resources.

Security and DCOM

DCOM distinguishes between four fundamental aspects of security:

·        Access security. Which security principals are allowed to call an object?

·        Launch security. Which security principals are allowed to create a new object in a new process?

·        Identity. What is the security principal of the object itself?

·        Connection policy. Integrity--can messages be altered? Privacy--can messages be intercepted by others? Authentication--can the object find out or even assume the identity of the caller?

The most obvious security requirement on distributed applications is the need to protect objects against unauthorized access. Sometimes only authorized users are supposed to be able to connect to an object. In other cases, non-authenticated or unauthorized users might be allowed to connect to an object, but must be limited to certain areas of functionality.

Current implementations of DCOM provide declarative access control on a per-process level. Existing components can be securely integrated into a distributed application by simply configuring their security policy as appropriate. New components can be developed without explicit security awareness, yet still run as part of a completely secure distributed application.

If an application requires more flexibility, objects can programmatically perform arbitrary validations, be it on a per-object basis, per-method basis, or even per-method parameter basis. Objects might also want to perform different actions depending on who the caller is, what specific access rights the caller has, or to which user group the caller belongs.

Another related requirement on a distributed infrastructure is to maintain control over who can create objects. Since all COM objects on a machine are potentially accessible via DCOM, it is critical to prevent unauthorized users from creating instances of these objects. This protection has to be performed without any programmatic involvement of the object itself, since the mere act of launching the server process could be considered a security breach and would open the server to denial-of-service attacks.

The COM libraries thus perform special security validations on object activation. If a new instance of an object is to be created, COM validates that the caller has sufficient privileges to perform this operation. The privilege information is configured in the registry, external to the object.

Another aspect of distributed security is that of controlling the objects themselves. Since an object performs operations on behalf of arbitrary callers, it is often necessary to limit the capabilities of the object itself. One obvious approach is that of making the object assume the identity of the caller. Whatever action the object performs--a file access, network access, registry access, and so on--is limited by the caller's privileges. This approach works well for objects that are used exclusively by one caller since the security identity is established once at object creation time. (For more information, see explanation of "Run as Activator" in "Fundamentals: Windows NT security infrastructure" later in this section.) The approach can also be used for shared objects if the object performs an explicit action on each method call. (For more information, see explanation of "Impersonation" in "Programmatic Security" later in this section.)

However, for applications with a large number of users, the approach of making the object assume the identity of the caller can pose problems. All resources that are potentially used by an object need to be configured to have exactly the right set of privileges. If the privileges are too restrictive, some operations on the object will fail. If the privileges are too generous (for example, there is write access to some files where only read access is required), security violations might be possible if the object is not well behaved. Although managing access can be simplified by using user groups, it is often simpler to have the object itself run under a dedicated security identity, independent of the security identity of the current caller.

Other applications may not even be able to determine the security identity of the caller. Many Internet applications, for example, do not assign a dedicated user account for every user. Any user can use the application and yet the objects still need to be secure when accessing resources. Again, assigning objects a security identity of their own makes this kind of application manageable in terms of security.

As the "wire" between callers and objects becomes longer, it becomes more likely that data transported as part of method invocations will be altered or intercepted by third parties. DCOM gives both callers and objects a range of choices to determine how the data on the connection is to be secured. The overhead in terms of machine and network resources tends to grow with the level of security. DCOM, therefore, lets applications dynamically choose the level of security they require.

Physical data integrity is usually guaranteed by the low-level network transport. If a network error alters the data, the transport automatically detects this and retransmits the data. However, for secure distributed applications, data integrity really means the ability to determine whether the data actually originated from a legitimate caller and whether it has been logically altered by anyone. The process of authenticating the caller can be relatively expensive, depending on the security provider and the security protocol it implements. DCOM lets applications choose whether and how often this authentication occurs (see the next section for details).

DCOM currently offers two fundamental choices with regard to data protection: integrity and privacy. Clients or objects can request that data be transferred with additional information that ensures data integrity. If any portion of the data is altered on its way between the client and the object, DCOM will detect this and automatically reject the call. Data integrity implies that each and every data packet contains authentication information.

However, data integrity does not mean no one can intercept and read the data being transferred. Clients or objects can request that all data be encrypted (see the explanation of packet privacy in the next section). Encryption implies an integrity check as well as per-packet authentication information. Since both privacy and integrity require authentication, the mechanisms for specifying privacy and authentication are unified into a single enumeration of authentication levels, which are described in the next section.

The above mechanisms for access check, launch permission check, and data protection require some mechanism for determining the security identity of the client. This client authentication is performed by one of the security providers, which returns unique session tokens that are used for ongoing authentication once the initial connection has been established. The initial authentication often requires multiple roundtrips between caller and object. The NTLM security provider, for example, authenticates by challenging the caller: the security provider knows the password of the user (more precisely an MD4 hash of the password). It encrypts a randomly generated block of data using the MD4 hash of the password and sends it back to the client (the challenge). The client then decrypts the data block and returns it to the server. If the client also knows the correct password, the decryption is successful and the server knows that the client is "authentic." The NTLM security provider then generates a unique access token, which it returns to the client for future use. For future authentication, the client can simply pass in the token, and the NTLM security provider does not perform the extra roundtrips for the "challenge/response" protocol.
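The general shape of a challenge/response exchange followed by token reuse can be sketched as shown here. This is a generic illustration, not the NTLM protocol: the server sends a random challenge and the client returns a keyed hash of it, instead of the encrypt/decrypt step described above, and SHA-256 and HMAC stand in for the MD4-based scheme. The class and function names are hypothetical.

```python
import hashlib
import hmac
import os


def password_hash(password):
    # SHA-256 is a stand-in here; the text above describes an MD4 hash.
    return hashlib.sha256(password.encode("utf-8")).digest()


class Server:
    def __init__(self, accounts):
        # The server keeps only hashes of the passwords, never the passwords.
        self._hashes = {user: password_hash(pw) for user, pw in accounts.items()}
        self._pending = {}    # user -> outstanding challenge
        self._sessions = {}   # token -> user

    def challenge(self, user):
        self._pending[user] = os.urandom(16)
        return self._pending[user]

    def verify(self, user, response):
        """Check the response; on success return a session token, else None."""
        challenge = self._pending.pop(user, None)
        key = self._hashes.get(user)
        if challenge is None or key is None:
            return None
        expected = hmac.new(key, challenge, hashlib.sha256).digest()
        if not hmac.compare_digest(expected, response):
            return None
        token = os.urandom(16).hex()
        self._sessions[token] = user
        return token

    def user_for(self, token):
        """Later calls present the token instead of repeating the handshake."""
        return self._sessions.get(token)


def client_response(password, challenge):
    return hmac.new(password_hash(password), challenge, hashlib.sha256).digest()
```

The password itself never crosses the wire in this sketch. Once verify has returned a token, later requests can be matched to the user with a single dictionary lookup, which mirrors the way the extra roundtrips are avoided after the initial authentication.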

DCOM uses the access token to speed up security checks on calls. However, to avoid the additional overhead of passing the access token on each and every call, DCOM by default only requires authentication when the initial connection between two machines is established (RPC_C_AUTHN_LEVEL_CONNECT). It then caches the access token on the server side and uses it automatically whenever it detects a call from the same client.

For many applications this level of authentication is a good compromise between performance and security. However, some applications may require additional authentication on each and every call. Often, certain methods in an object are more sensitive than others. An online shopping mall might only require authentication on connection establishment as long as the client is only calling methods for "browsing" the shopping mall. But when the client actually orders merchandise and passes in credit card information or other sensitive information, the object might require calls to be individually authenticated.

Depending on the transport used and the size of the method data to be transmitted, a method invocation can actually require multiple data packets on the network. DCOM lets applications choose whether only the first packet of a method invocation contains authentication information (RPC_C_AUTHN_LEVEL_CALL) or whether each packet should be individually authenticated (RPC_C_AUTHN_LEVEL_PKT).

As discussed in the previous section, authentication, integrity, and privacy are tightly related. For this reason, DCOM defines a single set of constants that conveys the level of authentication and privacy. These constants are the same constants as defined for DCE RPC.
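Because the levels form a single ordered scale, they can be modeled as an ordered enumeration. The sketch below uses the numeric values of the RPC_C_AUTHN_LEVEL_* constants from the RPC headers. The helper functions, including the policy for the shopping-mall example, are hypothetical illustrations and not part of any API.

```python
from enum import IntEnum


class AuthnLevel(IntEnum):
    """Authentication levels, using the numeric values of the RPC constants."""

    DEFAULT = 0        # RPC_C_AUTHN_LEVEL_DEFAULT
    NONE = 1           # RPC_C_AUTHN_LEVEL_NONE: no authentication
    CONNECT = 2        # RPC_C_AUTHN_LEVEL_CONNECT: authenticate on connection
    CALL = 3           # RPC_C_AUTHN_LEVEL_CALL: authenticate at the start of each call
    PKT = 4            # RPC_C_AUTHN_LEVEL_PKT: authenticate every packet
    PKT_INTEGRITY = 5  # RPC_C_AUTHN_LEVEL_PKT_INTEGRITY: also detect tampering
    PKT_PRIVACY = 6    # RPC_C_AUTHN_LEVEL_PKT_PRIVACY: also encrypt the data


def level_for_method(sensitive):
    """Hypothetical policy for the shopping-mall example: browse vs. order."""
    return AuthnLevel.PKT_PRIVACY if sensitive else AuthnLevel.CONNECT


def provides_integrity(level):
    return level >= AuthnLevel.PKT_INTEGRITY


def provides_privacy(level):
    return level >= AuthnLevel.PKT_PRIVACY
```

Each level includes the guarantees of the levels below it, which is why a simple comparison is enough to ask whether a given level provides integrity or privacy.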

Object RPC (ORPC)

The DCOM protocol, known as Object RPC, or ORPC, is a set of definitions that extends the standard DCE RPC protocol. It has been designed specifically for the DCOM object-oriented environment, and specifies how calls are made across the network and how references to objects are represented and maintained. The ORPC protocol has been submitted as an Internet Draft to the Internet Engineering Task Force (IETF), as it is suited to both Internet and intranet component communication.

At the wire level, ORPC uses standard RPC packets, with additional DCOM-specific information--in the form of an Interface Pointer Identifier (IPID), versioning information, and extensibility information--conveyed as additional parameters on calls and replies. The IPID is used to identify a specific interface on a specific object on a server machine where the call will be processed. The marshaled data on an ORPC packet is stored in standard Network Data Representation (NDR) format, so issues of byte order and floating point formats are automatically handled. DCOM uses one new NDR type, which represents a marshaled interface. DCOM client machines are also responsible for periodically ensuring that objects are kept alive on server machines by pinging between machines in the background, a process that has been optimized to reduce unnecessary pinging and to minimize network traffic (for more information, see "Pinging" in the "Connection Management" section earlier in this paper).

Programmers, for the most part, do not have to work at the ORPC level. The Microsoft Interface Definition Language (MIDL) compiler can be used to automatically generate the code that is needed to transfer the data across the network, based simply on an IDL file. Strictly speaking, MIDL is not part of DCOM and any tool can be used to generate marshaling code, but it is convenient to use MIDL, with its C-like semantics.

As part of the migration to DCOM, IDL has been extended to include the functionality found in Microsoft's Object Definition Language (ODL). As a result, the MKTYPLIB utility is no longer needed, its functionality having been subsumed into version 3.0 of MIDL. The MIDL compiler can take an IDL specification and generate the C code needed to transfer, or marshal, the information across the network. Marshaling is only required when the client is calling a server that exists in another address space or another machine. For more information on marshaling, see "Remote Method Calls: Marshaling and Unmarshaling" earlier in this paper.

MIDL was already a key part of COM before DCOM, and in most cases the standard proxy and stub marshaling code generated by MIDL is all that is needed to ensure that DCOM can communicate with a remote object. However, there are situations in a remoted environment when an object may wish to use its own form of custom marshaling, perhaps to optimize performance.

Related Information

The Component Object Model Specification http://www.microsoft.com/com/resources/comdocs.asp

DCOM Technical Overview http://www.microsoft.com/ntserver/appservice/techdetails/overview/dcomtec.asp