Windows Server 2008 read only domain controllers

This content is 17 years old. I don't routinely update old blog posts as they are only intended to represent a view at a particular point in time. Please be warned that the information here may be out of date.

This is the last post I’m intending to write based on the content from the recent Windows Server UK User Group meeting – this time inspired by Scotty Mc Leod’s presentation on read only domain controllers (RODCs), a new feature in Windows Server 2008.

In my post from a few weeks back about some of the new features in Windows Server 2008, I wrote:

Backup domain controllers (BDCs) are back! Except that now they are called read-only domain controllers (with unidirectional replication to offer credential caching and whilst increasing the physical security of remote domain controllers, e.g. in branch offices).

That statement was slightly tongue-in-cheek and, if taken literally, would be inaccurate. RODCs are more complex than Windows NT BDCs were. Active Directory still uses a multiple-master replication model, but RODCs are really a means of providing a read-only replica of the directory (with outbound replication disabled) – for example, at remote sites where a fully-functional domain controller would be a security risk. As far as Active Directory is concerned, an RODC is not a domain controller – it actually has a standard workstation account (with some extra attributes).

This has a major advantage in that, unlike a domain controller, an RODC has a local account database, with a local Administrators group (of which Domain Admins will be a member). In effect, this means that a user can be made a full administrator of the RODC, without needing to be a Domain Admin.

In order to create an RODC, the forest and domain need to be at Windows Server 2003 functional level, with at least one (preferably more) Windows Server 2008 DC present. The forest must also have been prepared for RODCs with adprep /rodcprep.
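
For reference, the preparation commands are run from the \sources\adprep folder on the Windows Server 2008 media – a sketch of the usual sequence (forestprep runs on the schema master, domainprep on the infrastructure master):

rem extend the schema and prepare the forest
adprep /forestprep
rem prepare each domain (and its group policy permissions)
adprep /domainprep /gpprep
rem prepare the forest's DNS application partitions for RODCs
adprep /rodcprep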

The next stage is to provision the computer account, selecting a site and whether or not DNS/Global Catalog services will be enabled. Control over the information stored on an RODC is exercised through password replication policies – allow/deny lists for replication of passwords based on users, groups or computers. Two new groups are created – DeniedRODCPassword and AllowsRODCPassword – and, as for other Windows NT ACLs, deny takes precedence over allow. Next, it’s necessary to define who will manage the RODC – this effectively defines a user account that can administer the server without needing Domain Admins membership (e.g. to apply patches, restart the server, etc.). One gotcha is that this is a user account (not a group) – many organisations will circumvent this with service accounts, but that’s really not good practice.
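
Once the RODC is built, the effective password replication policy can be inspected from the command line with the Windows Server 2008 version of repadmin – a minimal sketch, with rodcname as a placeholder:

rem accounts whose passwords are allowed to be cached on the RODC
repadmin /prp view rodcname allow
rem accounts whose passwords have actually been cached
repadmin /prp view rodcname reveal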

Following this, a new computer account should be visible in the directory. The Windows Server 2003 version of Active Directory Users and Computers (ADUC) will see the account as disabled, whereas the Windows Server 2008 tools will report it as an unoccupied DC account. On joining the domain, the computer will be linked with its account and will become an RODC.

The RODC concept relies on a principle called constrained Kerberos delegation, which in turn needs linked-value replication – hence the requirement for Windows Server 2003 domain and forest functional levels. In addition, a Windows Server 2008 DC is required for the RODC to communicate with, as a Windows Server 2003 DC would see the RODC as a “normal” computer – e.g. a workstation. Of course, that Windows Server 2008 DC is potentially a single point of failure, so more than one should be deployed.

The constrained Kerberos authentication works as follows:

  • In addition to the krbtgt account that will already exist in the domain (a Kerberos ticket granting service account), each RODC will have its own TGT account created in the form krbtgt_identifier in order to issue its own Kerberos tickets without compromising domain security.
  • If a user attempts to log on at a remote site, their credentials will initially be validated by the local RODC.
  • Because password hashes are stripped from RODC replication, if this is the user’s first login attempt, or if they are not in the AllowsRODCPassword group, then the authentication request will be passed across the WAN to a full DC. When the ticket is returned, the RODC asks a full DC running Windows Server 2008 to replicate a single attribute (the password hash), which is then held for future logins.
  • If a login is authenticated by the RODC then a local Kerberos ticket is issued. This local ticket will not be valid elsewhere on the domain (effectively each RODC becomes a subdomain for authentication purposes) and requests to access other resources will be referred to a full DC running Windows Server 2008.

It is possible to force inbound replication to an RODC for a defined set of users (i.e. to pre-populate the information for users on a particular site); however this information can quickly become stale.
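
Pre-population can be scripted with repadmin – a sketch, with placeholder server and user names:

rem pre-populate one user's password onto the RODC, pulling from a full (hub) DC
repadmin /rodcpwdrepl rodcname hubdcname "cn=username,ou=users,dc=domainname,dc=tld"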

Scotty went on to mention a couple of things to beware of when planning to use RODCs:

  • Because an RODC cannot be written to, some applications will see an RODC as a standard LDAP server; if an LDAP v3 referral to a writable DC is invoked, then many applications will fail.
  • Whilst Exchange Server will treat an RODC as a GC, Outlook will not.

Group policy in Windows Vista

This content is 17 years old. I don't routinely update old blog posts as they are only intended to represent a view at a particular point in time. Please be warned that the information here may be out of date.

Windows Vista makes a number of changes to the implementation and management of group policy objects (GPOs) and, as group policy is something that I haven’t worked with for a while, I figured it was time to take another look. A week or so back, I spent the morning at Microsoft, where Steve Lamb presented a session on using Group Policy in Windows Vista to control user behaviour and network security.

Policy has existed in various versions of Windows for a long time, but group policy was introduced in Windows 2000 (enforced by Active Directory) and many group policy settings are also available as local computer policies (used when a machine is not authenticated by an Active Directory domain controller). Each new version of Windows increases the scope of what can be controlled using policies and Windows Vista is no exception, with a significant increase in the available options (Microsoft quotes various figures but they all indicate at least 2000 new settings). The new areas covered include removable device management, power management and user access control. There are also new management tools: the group policy management console (GPMC) is now included with Windows (previously, it was a separate download) and the group policy editor (gpedit.msc) now supports filtering of administrative template policy settings via a context-sensitive option on the view menu to show, for example, only those settings that apply to at least Windows XP Professional with SP2.

Windows Vista also makes improvements to policy control around network awareness, detecting changes in network conditions (e.g. connecting to a new network) and enforcing new policy settings accordingly. There are also improvements to the application of policy (with fewer requirements for synchronous application of policy).

It’s important to note the difference between a policy – stored in a subfolder (machine or user) on the domain controller under %systemroot%\sysvol\sysvol\domainname\policies\guid\ – and policy definition files – stored at the same location but simply defining the available settings.

Although Windows Vista will still act on legacy (.adm) policy definition files, policy definitions created under Windows Vista use a new XML-based file format with an .admx extension. Furthermore, Windows Vista group policy uses separate .adml files to provide the language-specific textual components of each policy.

When editing policy on a Windows Vista computer, the policy definition files are stored at %systemroot%\policydefinitions\ with one .admx file for each area of control and associated .adml files in each language subfolder (e.g. en-us).

These can be copied to the central store (really just a grand name for the policies folder that is replicated as part of sysvol) in order to make them available for administration from multiple locations. Central store copies of policy definitions will then take precedence over local copies (but legacy clients will be unaffected by the new settings).
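
Creating the central store is simply a file copy – a minimal sketch, assuming a domain called domainname.tld (adjust names and paths to suit):

rem copy the local policy definitions (including language subfolders) to the central store
xcopy %systemroot%\PolicyDefinitions \\domainname.tld\sysvol\domainname.tld\policies\PolicyDefinitions /E /I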

Although legacy clients will simply ignore policy settings that they do not understand, Microsoft recommends that once Windows Vista policies are implemented, then no further policy edits should be made from pre-Vista computers. The reasoning for this is that even opening the policy definition on a pre-Vista computer will cause the legacy .adm files to be created on the sysvol and this leads to a phenomenon known as sysvol bloat. By using only Windows Vista clients for group policy management, this bloat can be avoided. It’s also worth noting that GPO reporting should be performed within the Windows Vista version of the GPMC (rather than using the resultant set of policy MMC snap-in) and that new policy backups should be taken using the Windows Vista GPMC to avoid issues when restoring policy backups taken from GPMC running on Windows XP/Server 2003. Further details for managing group policy administrative template (.adm) files can be found in Microsoft knowledgebase article 816662.

For bringing forward settings from legacy (.adm) policy templates, Microsoft has licensed the ADMX Migrator utility (from Full Armor).

Another new feature with Windows Vista group policy is the ability to define multiple local policies (administrator, non-administrator and per-user) and even to disable local policy altogether on domain-joined computers. Whilst the local computer policy remains (and is created by default), further local policies may be created using the group policy editor. This is useful for computers over which some control is required but which fall outside the scope of management for Active Directory (e.g. kiosks or computers deployed in a DMZ).

Troubleshooting group policy is aided with Windows Vista’s improved event logging (with more useful events and links to support information on the Internet) as well as the ability to view events in friendly (human-readable) format or XML (for analysis/processing). The new event viewer also supports the ability to create subscriptions. Actions can also be associated with events (e.g. send an e-mail, or execute a script).

Filters can be used to view just group policy events and by drilling down into the appropriate logfile, an activity ID can be extracted from a failure event to further filter events, or to view with the group policy log view (gplogview.exe) – another free download from Microsoft. This allows for step-by-step group policy processing to identify the failure point and any error codes, after which changes can be made and gpupdate.exe used to apply the new settings for re-analysis.
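
The group policy operational events can also be pulled straight from the command line – a sketch using the standard event channel name:

rem show the 20 most recent group policy events in human-readable form, newest first
wevtutil qe Microsoft-Windows-GroupPolicy/Operational /c:20 /rd:true /f:text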

For enterprise customers, Microsoft has a new tool for advanced group policy management – GPOVault is part of the desktop optimisation pack for software assurance (DOPSA), gained as part of Microsoft’s acquisition of DesktopStandard.

Using Active Directory to authenticate users on a Linux computer

This content is 18 years old. I don't routinely update old blog posts as they are only intended to represent a view at a particular point in time. Please be warned that the information here may be out of date.

I’m not sure if it’s the gradual improvement in my Linux knowledge, better information on the ‘net, or just that integrating Windows and Unix systems is getting easier but I finally got one of my non-Windows systems to authenticate against Active Directory (AD) today. It may not sound like much of an achievement but I’m pretty pleased with myself.

Active Directory is Microsoft’s LDAP-compliant directory service, included with Windows server products since Windows 2000. The AD domain controller that I used for this experiment was running Windows Server 2003 with service pack 2 (although the domain is still in Windows 2000 mixed mode and the forest is at Windows 2000 functional level) and the client PC was running Red Hat Enterprise Linux (RHEL) 5.

The first step is to configure the Linux box to use Active Directory. I ran this as part of the RHEL installation but it can also be configured manually, or using system-config-authentication. The best way to do this is using LDAP and Kerberos (as described by Scott Lowe) but Scott’s advice indicates that would require some AD schema changes to incorporate Unix user information; the method I used is based on Winbind and doesn’t seem to require any changes on the server as Winbind allows a Unix/Linux box to become a full member of a Windows NT/AD domain.

The settings I used can be seen in the screen grab, specifying the Winbind domain (NetBIOS domain name), security model (ADS), Winbind ADS realm (DNS domain name), Winbind domain controller(s) and the template shell (for users with shell access). Following this, I selected the Join Domain button and supplied appropriate credentials, and the machine successfully joined the domain (an error was displayed in the terminal window indicating that Kerberos authentication had failed – not surprising, as it hadn’t been configured – but the message continued by reporting that it had fallen back to RPC communications, resulting in a successful join).

For reference, the equivalent manual process would have been something like:

  1. Edit the name service switch file (/etc/nsswitch.conf) to include the following:
    passwd: files winbind
    shadow: files winbind
    group: files winbind
    netgroup: files
    automount: files

  2. Edit the Samba configuration file (/etc/samba/smb.conf) to include the following configuration lines in the [global] section:
    workgroup = DOMAINNAME
    security = ads
    password server = domaincontroller.domainname.tld
    realm = DOMAINNAME.TLD
    idmap uid = 16777216-33554431
    idmap gid = 16777216-33554431
    template shell = /bin/bash
    winbind use default domain = false

  3. Edit the PAM authentication configuration (/etc/pam.d/system-auth) to append broken_shadow to account required pam_unix.so and to insert:
    auth sufficient pam_winbind.so use_first_pass
    account [default=bad success=ok user_unknown=ignore] pam_winbind.so
    password sufficient pam_winbind.so use_authtok

  4. Join the domain:
    /usr/bin/net join -w DOMAINNAME -S domaincontroller.domainname.tld -U username

  5. Restart the winbind and nscd services:
    service winbind restart
    service nscd restart

It’s also possible to achieve the same results using authconfig (as described by Bill Boswell).
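
Whichever route is taken, it’s worth verifying the join and the winbind lookups before going any further – something along these lines (DOMAINNAME and username are placeholders):

# check the machine account trust secret against the DC
wbinfo -t
# list domain users via winbind
wbinfo -u
# confirm a domain user resolves through the name service switch
getent passwd DOMAINNAME\\username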

Once these configuration changes have been made, AD users should be able to authenticate, but they will not have home directories on the Linux box, resulting in a warning:

Your home directory is listed as:

‘/home/DOMAINNAME/username’

but it does not appear to exist. Do you want to log in with the / (root) directory as your home directory? It is unlikely anything will work unless you use a failsafe session.

or just a simple:

No directory /home/DOMAINNAME/username!

Logging in with home = “/”.

This is easy to fix, as described in Red Hat knowledgebase article 5367, adding session required pam_mkhomedir.so skel=/etc/skel umask=0077 to /etc/pam.d/system-auth. After restarting the winbind service, the first subsequent login should be met with:

Creating directory ‘/home/DOMAINNAME/username’

The parent directory must already exist; however some control can be exercised over the naming of the directory – I added template homedir = /home/%D/%U to the [global] section in /etc/samba/smb.conf (more details can be found in Red Hat knowledgebase article 4760).

At this point, AD users can log on (using DOMAINNAME\username at the login prompt) and have home directories dynamically created, but (despite selecting the “cache user information” and “local authorization is sufficient for local users” options in system-config-authentication) if the computer is offline (e.g. a notebook computer away from the network), then login attempts will fail and the user is presented with the following warning:

Incorrect username or password. Letters must be typed in the correct case.

or:

Login incorrect

In order to allow offline working, I followed some advice relating to another Linux distribution (Mandriva disconnected authentication and authorisation) but it still worked for me on RHEL. All that was required was the addition of winbind offline logon = yes to the [global] section of /etc/samba/smb.conf along with some edits to the /etc/pam.d/system-auth file:

  • Append cached_login to auth sufficient pam_winbind.so use_first_pass.
  • Add account sufficient pam_winbind.so use_first_pass cached_login (the resulting winbind lines are sketched after this list).
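
For clarity, after these edits the winbind-related lines in /etc/pam.d/system-auth look something like this (a sketch based only on the changes described above, not a complete file):

auth     sufficient pam_winbind.so use_first_pass cached_login
account  sufficient pam_winbind.so use_first_pass cached_login
password sufficient pam_winbind.so use_authtok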

These changes (along with another winbind service restart) allowed users to log in using cached credentials (once a successful online login had taken place), displaying the following message:

Logging on using cached account. Network ressources [sic] can be unavailable

Unfortunately, the change also prevented local users from authenticating (except root), with the following strange errors in /var/log/messages:

May 30 11:30:42 computername pam_winbind[3620]: request failed, but PAM error 0!
May 30 11:30:42 computername pam_winbind[3620]: internal module error (retval = 3, user = `username')
May 30 11:30:42 computername login[3620]: Error in service module

After a lot of googling, I found a forum thread at LinuxQuestions.org that pointed to account [default=bad success=ok user_unknown=ignore] pam_winbind.so as the culprit. After I removed this line from /etc/pam.d/system-auth (it had already been replaced with account sufficient pam_winbind.so use_first_pass cached_login), both AD and local users could successfully authenticate:

May 30 11:37:25 computername -- username[3651]: LOGIN ON tty1 BY username

I should add that this configuration is not perfect – Winbind seems to take a minute or so to work out that cached credentials should be used (sometimes resulting in failed login attempts before allowing a user to log in) and it also seems to take a long time to log in when working offline but, nevertheless, I can use my AD accounts on the Linux workstation and I can log in when I’m not connected to the network.

If anyone can offer any advice to improve this configuration (or knows how moving to a higher domain/forest functional level may affect it), please leave a comment below. If you wish to follow the full LDAP/Kerberos authentication route described in Scott Lowe’s article (linked earlier), it may be worth checking out Microsoft Services for Unix (now replaced by the Identity Management for Unix component in Windows Server 2003 R2) or the open source alternative, AD4Unix.

Duplicate computer name prevents Active Directory domain logon

This content is 18 years old. I don't routinely update old blog posts as they are only intended to represent a view at a particular point in time. Please be warned that the information here may be out of date.

I came across an interesting problem a few nights back… I locked myself out of a Windows XP computer. Here’s how it happened, along with how I got back in.

First, I built a new Windows Server and inadvertently used the same name as an existing Windows XP computer. Then I joined the server to an Active Directory domain (from this point on, the machine that was originally using the computer name is unable to authenticate with the domain as its password will have been overwritten when the duplicate machine joined the domain).

I then turned on the Windows XP computer. Because this machine is a notebook PC and wasn’t connected to the network at the time, I logged in using cached credentials; however, after installing a wireless network card and restarting the computer, I was presented with a message that indicated I could not log on to the domain. Unfortunately, I didn’t make a note of the exact message at the time but, looking back, I can see the NetLogon event 3210 in the system event log, the description for which tells me exactly what the problem was:

This computer could not authenticate with \\domaincontroller.domainname.tld, a Windows domain controller for domain domainname, and therefore this computer might deny logon requests. This inability to authenticate might be caused by another computer on the same network using the same name or the password for this computer account is not recognized. If this message appears again, contact your system administrator.

Realising my mistake, I logged on using a local account and tried to rejoin the domain. Except that I couldn’t, because, as per Microsoft’s advice, I had disabled the local administrator account when I joined the domain and all I had available to me were standard user accounts.

Luckily Daniel Petri has published an article with a workaround for when a Windows computer cannot log on to a Windows Server 2003 domain due to errors connecting to the domain. By removing the network cable and restarting, I could log on as a domain administrator using cached credentials. Then, I enabled the local administrator account and changed the computer name before moving the computer out of the domain and into a workgroup. I then rebooted (with the network cable connected), logged in using the re-enabled administrator account and rejoined the domain (with the new computer name), before disabling the administrator account again.
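
For what it’s worth, the rename/rejoin steps can also be performed with netdom from the Windows support tools – a sketch, assuming the tools are installed and substituting real names and credentials:

rem rename the machine so that it no longer clashes with the server
netdom renamecomputer oldname /newname:newname /userd:domainname\adminuser /passwordd:*
rem after a reboot, rejoin the domain under the new name
netdom join newname /domain:domainname.tld /userd:domainname\adminuser /passwordd:*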

Phew!

Delegation of Active Directory administration (using Quest ActiveRoles Server)

This content is 18 years old. I don't routinely update old blog posts as they are only intended to represent a view at a particular point in time. Please be warned that the information here may be out of date.

Recently, I’ve been working with a client who has an extraordinarily high number of users with domain administrator rights (i.e. those who are members of the Domain Admins group). The problem is historic and they are in the process of moving from Windows NT to Active Directory (AD); whilst AD allows for delegation of control over objects (although best practice dictates that delegation occurs at organisational unit level), under NT the limit for delegation was the domain.

In order to reduce the number of Domain Admins, I’ve been producing a delegation model for AD administration that is intended to provide a pragmatic balance between the granular control that AD can provide and the access requirements of each support team, yet still remains realistic from a management perspective. One major issue is that, whilst Microsoft provides several hundred pages of documentation and a delegation of control wizard, there are no native tools to keep track of the objects over which control has been delegated. Consequently, it’s often necessary to resort to third party tools.
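
Granting the rights themselves is easy enough natively – for example, delegating password resets over an OU with dsacls (a sketch with hypothetical names); it’s reporting on such delegations afterwards that the native tools don’t really address:

rem let the Helpdesk group reset passwords on user objects throughout the Staff OU
dsacls "ou=Staff,dc=domainname,dc=tld" /I:S /G "DOMAINNAME\Helpdesk:CA;Reset Password;user"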

One such tool is ActiveRoles Server (ARS) from Quest Software. Quest inherited this technology with their acquisition of Aelita Software (they had previously inherited another product, now known as ActiveRoles Direct, when they purchased FastLane Technologies). Installed onto a Windows server (which should be secured as any domain controller would be), the current incarnation of the product uses a SQL Server database for configuration data (rather than schema extensions, as some previous products did) and publishes itself as a connection point object within AD. The configuration database can be mirrored via SQL replication for redundancy, with one server acting as a publisher and one as a subscriber, whilst the connection point model allows for load balancing between the two servers.

In terms of management, ARS can be administered using a Microsoft management console (MMC) snap-in, a browser interface, or using AD services interface (ADSI). By default, ARS will bind to the first AD domain controller that it finds, although this can be overridden in the management toolset.

Despite not extending the AD schema, ARS allows additional attributes to be stored for an object. These attributes are placed within the ARS configuration database and can be used for provisioning (e.g. conditional filtering on attributes) or for storing additional information on a user (e.g. staff ID number). Propagation of directory data to other LDAP directories and to Microsoft Identity Integration Server (MIIS) is supported via Quick Connect for ActiveRoles Server, and Unix support can be provided through a support pack for Vintela Authentication Services. ARS can also expose attributes that are not normally visible in the standard Active Directory Users and Computers MMC snap-in.

In order to allow for user rights to be elevated as required, user access is proxied via the ARS service account, which should be given the highest level of permissions that will be allowed (e.g. Domain Admins). This means that all access is via ARS, allowing for auditing and reporting of rights use. Quest’s recommendation is that users are not assigned native rights within Active Directory (beyond the standard read-only permissions given to an authenticated user). In this way, all rights can be managed via ARS (otherwise privileged users could circumvent ARS, avoiding any auditing of their actions); however there is also an option for ARS-delegated rights to be propagated to Active Directory if required.

Some ARS terminology includes:

  • Access templates: pre-defined role descriptions controlling what a user can/cannot do. ARS allows further granularity than native AD rights – for example controlling which attributes a particular user can edit on an object (e.g. allowing for self service of certain directory attributes via a web interface).
  • Managed units: query-based filters for management of roles (effectively a virtual OU). This avoids issues whereby best practice recommends delegation at OU level but the OU structure is generally designed with group policy in mind.
  • Policy objects: rules applied to objects as they are created (e.g. when creating a user in a particular OU, add them to certain security groups).
  • Script modules: bespoke code that allows policy objects to be extended beyond the standard capabilities of AD OUs and group policy (e.g. when creating a user account, e-mail the telephone system administrator and ask them to populate the user’s telephone number in AD).

ARS seems pretty powerful but it does have some limitations:

  • Firstly, it operates at the domain level, so delegation of forest-level tasks does not seem to be supported.
  • Secondly ARS is used to provide delegation of control over directory objects – not the resources protected by the directory itself (e.g. file systems). This means that ARS can be used to control the administration of the groups that allow access to a particular resource; but there is nothing that it can do to prevent a sufficiently-privileged user from bypassing ARS and accessing a resource directly.

In reality, this has meant that my client has built part of the delegation model for AD using the Quest tools (the translation of the IT policy and procedures to a provisioning model built around ARS) whilst I have based the administration model for the servers and computers within the domain (as well as forest-wide operations) around Windows groups, with procedural control over the use of privileged and non-privileged accounts.

Although I’ve been working with Active Directory since Windows NT 5.0 beta 2 (about 8 years now), this is the first time I’ve really looked at the administration model. It’s been a difficult process for me – to do it properly requires business analysis skills as well as (and probably more than) technical knowledge.

DNS and operations master roles placement with Active Directory

This content is 18 years old. I don't routinely update old blog posts as they are only intended to represent a view at a particular point in time. Please be warned that the information here may be out of date.

I had a call last night from a client who is implementing Active Directory (AD) in his organisation and was trying to resolve some replication issues. Like so many problems in AD the issue was related to the DNS configuration and once I had made a few configuration changes on the DNS servers to build a forwarding hierarchy from the remote sites to the head office and then on to the ISP, everything started to work.
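
The forwarding hierarchy itself is straightforward to script with dnscmd – a sketch with placeholder server names and addresses:

rem on each remote site DNS server, forward unresolved queries to head office
dnscmd remotednsserver /resetforwarders 192.168.1.10
rem at head office, forward to the ISP's resolvers
dnscmd hqdnsserver /resetforwarders 198.51.100.1 198.51.100.2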

Whilst I was looking over his domain, I also noticed that there was only a single global catalog (GC) server – the first domain controller that he’d installed (the same DC that was holding all the operations master roles, although in his single domain forest the co-hosting of the infrastructure master and GC roles will not cause problems with phantom objects, as described in Microsoft knowledge base article 248047).

Microsoft knowledge base article 825036 describes best practices for DNS client settings in Windows 2000 Server and in Windows Server 2003 whilst Microsoft knowledge base article 223346 discusses the placement and optimisation of operations master roles.

How not to image servers

This content is 19 years old. I don't routinely update old blog posts as they are only intended to represent a view at a particular point in time. Please be warned that the information here may be out of date.

A couple of weeks back, I wrote about using Microsoft’s system preparation tool (SysPrep) to prepare virtual machine images for duplication. It doesn’t really matter whether the machine is virtual or physical, the principle is still the same (my point was that cloning virtual machines using a file copy is easy but needs to be prepared in a specific way – i.e. using SysPrep).

A few days ago I was completely amazed to hear how one of my clients had duplicated some of their servers – they had simply broken a mirror, placed the second disk in a new server, then added another disk in each server to recreate the mirror (repeat until all servers are successfully duplicated). It may be ingenious, but it’s also extremely bad practice.

The client in question is in the process of preparing for a migration from Windows NT to Windows Server 2003 and Active Directory. Although NT doesn’t get too upset if servers are cloned, including their security identifier (SID), Active Directory does. They now have three choices:

  • Rebuild the problem servers.
  • Remove the servers from the domain.
  • Use a tool like Sysinternals NewSID to change the SIDs (officially unsupported by Microsoft).

Whatever the decision, it’s all extra (and unnecessary) work – completely avoidable.
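
Incidentally, confirming that two machines really do share a SID is trivial with the Sysinternals PsGetSid tool – a sketch with placeholder computer names:

rem display the machine SID of each suspect server and compare
psgetsid \\server1
psgetsid \\server2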

Removing MOM’s Active Directory management pack helper object

This content is 19 years old. I don't routinely update old blog posts as they are only intended to represent a view at a particular point in time. Please be warned that the information here may be out of date.

A few months back I had a look at Microsoft Operations Manager (MOM) 2005. Then, a couple of weeks back, I noticed that one of my servers had the Microsoft Operations Manager 2005 Agent installed, as well as the Active Directory management pack helper object. I uninstalled the Microsoft Operations Manager 2005 agent from the Add/Remove programs applet in Control Panel, but when I went to remove the helper object I was greeted with the following error (and the MSI Installer logged event ID 11920 in the application log):

Active Directory Management Pack Helper Object
Service ‘MOM’ (MOM) failed to start. Verify that you have sufficient privileges to start system services.

Retrying the operation produced the same error, so I was forced to cancel, then confirm the cancellation, before finally receiving another error message (and the MSI Installer logged event ID 11725 in the application log):

Add or Remove Programs
Fatal error during installation.

The answer was found on the microsoft.public.mom newsgroup – I needed to reinstall the MOM agent before the AD management pack helper object could be removed but there was a slight complication because I no longer have a MOM server (I deleted my virtual MOM server after finishing my testing). Manual agent installation is possible, but I needed to supply false details for the management group name and management server in order to let the installation take place with a warning that the agent would keep retrying to contact the server (all other settings were left at their defaults).

Once the agent installation was complete, it was a straightforward operation to remove the Active Directory management pack helper object, before uninstalling the MOM agent (successfully indicated by MSI Installer event ID 11724 in the application log).

It’s a simple enough workaround but represents lousy design on the part of the MOM agent/management pack installers – surely any child helper object installations should be identified before a parent agent will allow itself to be uninstalled?

Problems accessing the Virtual Server administration website on a Windows Server 2003 domain controller

This content is 19 years old. I don't routinely update old blog posts as they are only intended to represent a view at a particular point in time. Please be warned that the information here may be out of date.

Although I have several computers at home, most of my server roles are running on a single PC. That means Active Directory (AD) domain controller (DC), DNS, DHCP, RIS, WSUS, and print services are all on one box (file services are on my NSLU2) so I figured that adding Virtual Server 2005 R2 to the mix shouldn’t be too big a problem. It’s certainly not good practice, but it works.

Another bad practice is to run internet information services (IIS) on a DC, but I already have IIS installed for WSUS, so adding the Virtual Server administration website should have been reasonably straightforward. Following installation, existing websites on the server were working as expected but any attempt to access the Virtual Server 2005 administration website resulted in an HTTP Error 403 – Forbidden: Access is denied. message, despite entering the domain administrator credentials when prompted (and already being logged on as the domain administrator).

From checking the event log, I found that Virtual Server was logging the following event on startup:

Event Type: Warning
Event Source: Virtual Server
Event Category: Virtual Server
Event ID: 1130
Date: 01/05/2006
Time: 15:28:23
User: NT AUTHORITY\NETWORK SERVICE
Computer: SERVER1
Description:
The service principal names for Virtual Server could not be registered. Constrained delegation cannot be used until the SPNs have been registered manually. Error 0x80072098 – Insufficient access rights to perform the operation.

For more information, see Help and Support Center at http://go.microsoft.com/fwlink/events.asp.

I tried the steps in Microsoft knowledge base article 890893 but adding the appropriate SPNs to AD didn’t seem to make any difference.
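
For reference, manual SPN registration is performed with setspn from the support tools – a sketch, assuming vssrvc as the Virtual Server service SPN prefix (check the knowledge base article for the exact names to register):

rem register the Virtual Server SPNs against the host's computer account
setspn -A vssrvc/server1 SERVER1
setspn -A vssrvc/server1.domainname.tld SERVER1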

A bit of Googling turned up a blog entry from David Wang which although not completely relevant, contained a reference to a similar problem in the comments. Sure enough, when I checked the IIS logs, the error code was 403 19, as shown below:

#Fields: date time s-sitename s-ip cs-method cs-uri-stem cs-uri-query s-port cs-username c-ip cs(User-Agent) sc-status sc-substatus sc-win32-status
2006-05-01 21:29:39 W3SVC2 ipaddress GET /VirtualServer/VSWebApp.exe view=1 1024 domainname\Administrator ipaddress Mozilla/4.0+(compatible;+MSIE+6.0;+Windows+NT+5.2;+SV1;+.NET+CLR+1.1.4322) 403 19 1314

I tried David’s advice of switching the IIS DefaultAppPool identity to LocalSystem and that worked (LocalSystem is a very highly-privileged account), but (despite my lackadaisical approach to co-hosting services and the probable security implications) I didn’t really feel that it was an ideal solution, so I switched back to Network Service. I then set about trying to work out why the Network Service account (NT AUTHORITY\NETWORK SERVICE) didn’t have the appropriate permissions. Microsoft knowledge base article 332097 looked as if it might be relevant (Microsoft knowledge base article 842493 is similar) but didn’t seem to solve the problem (in any case, the IIS_WPG group already had the correct permissions), so I fired up the Local Security Settings MMC snap-in and checked out the user rights assignment in the local security policy.

Because my IIS server is also a DC, many of the user rights normally associated with the Network Service account had been removed (and were overridden by the Default Domain Controllers Policy). NT AUTHORITY\NETWORK SERVICE was also missing from the IIS worker process group (IIS_WPG) membership (and could not be added as it is a local account) so I edited the local security policy and the Default Domain Controllers Policy (another bad practice – I should really have created a new policy for DCs running IIS) as follows:

  • Replace a process-level token (Default Domain Controllers Policy).
  • Adjust memory quotas for a process (Default Domain Controllers Policy).
  • Generate security audits (Default Domain Controllers Policy).
  • Log on as a batch job (Default Domain Controllers Policy).
  • Impersonate a client after authentication (local security policy).

The following user rights were already in existence:

  • Bypass traverse checking (inherited from Everyone).
  • Access this computer from the network (inherited from Everyone).
  • Log on as a service (Default Domain Controllers Policy).

After forcing a group policy refresh (using gpupdate /force) and issuing the iisreset command, I was able to access the Virtual Server administration website as expected; although the event 1130 warnings are still being recorded in the event log, along with event 1029 since I enabled the virtual machine remote control (VMRC) server:

Event Type: Warning
Event Source: Virtual Server
Event Category: Remote Control
Event ID: 1029
Date: 04/05/2006
Time: 21:19:18
User: NT AUTHORITY\NETWORK SERVICE
Computer: SERVER1
Description:
The service principal name for the VMRC server could not be registered. Automatic authentication will always use NTLM authentication. Error 0x80072098 – Insufficient access rights to perform the operation.

For more information, see Help and Support Center at http://go.microsoft.com/fwlink/events.asp.

I stress that running multiple services on a single PC (even with proper server hardware) is not a good idea; nor is running IIS on a DC; and neither is editing either the Default Domain Policy or the Default Domain Controllers Policy. If you need to do it though, hopefully these notes will help to work out why processes that rely on the Network Service account are not working as they should.

Maximising Active Directory performance and replication troubleshooting

This content is 19 years old. I don't routinely update old blog posts as they are only intended to represent a view at a particular point in time. Please be warned that the information here may be out of date.

Whenever I see that John Craddock and Sally Storey (from Kimberry Associates) are presenting a new seminar on behalf of Microsoft I try to attend. Not just because the content is well presented but because of the relevance of the material, some of which would involve wading through reams of white papers to find, but most importantly because, unlike most Microsoft presentations, there’s hardly any marketing material in there.

Last week, I saw John and Sally present on maximising Active Directory (AD) performance and in-depth replication troubleshooting. With a packed day of presentations there was too much good information (and detail) to capture in a single blog post, but what follows should be of interest to anyone looking to improve the performance of their AD infrastructure.

At the heart of Active Directory is the local security authority (LSA – lsass.exe), responsible for running AD as well as:

  • Netlogon service.
  • Security Accounts Manager service.
  • LSA Server service.
  • Secure sockets layer (SSL).
  • Kerberos v5 authentication.
  • NTLM authentication.

From examining this list of vital services, it becomes clear that tuning the LSA is key to maximising AD performance.

Sizing domain controllers (DCs) can be tricky. On the one hand, a DC can be very lightly loaded but at peak periods, or on a busy infrastructure, DC responsiveness will be key to the overall perception of system performance. The Windows Server 2003 deployment guide provides guidance on sizing DCs; however it will also be necessary to monitor the system to evaluate performance, predict future requirements and plan for upgrades.

Performance monitor is a perfectly adequate tool but running it can affect performance significantly, so event tracing for Windows (ETW – as described by Matt Pietrek) was developed for use on production systems. ETW uses a system of providers to pass events to event tracing sessions in memory. These event tracing sessions are controlled by one or more controllers, logging events to files, which can be played back to consumers (or, alternatively, the consumers can operate real-time traces). Microsoft server performance advisor is a free download which makes use of ETW, providing a set of predefined collectors.

Some of the basic items to look at when considering DC performance are:

  • Memory – cache the database for optimised performance.
  • Disk I/O – the database and log files should reside on separate physical hard disks (with the logs on the fastest disks).
  • Data storage – do applications really need to store data in Active Directory or would ADAM represent a better solution?
  • Code optimisation – how do the directory-enabled applications use and store their data; and how do they search?

In order to understand search optimisation, there are some technologies that need to be examined further:

  • Ambiguous name resolution (ANR) is a search algorithm used to match an input string to any of the attributes defined in the ANR set, which by default includes givenName, sn, displayName, sAMAccountName and other attributes.
  • Medial searches (*string*) and final string searches (*string) are slow. Windows Server 2003 (but not Windows 2000 Server) supports tuple indexing, which improves performance when searching for final character strings of at least three characters in length. Unfortunately, tuple indexing should be used sparingly because it degrades performance when the indexes are updated.
  • Optimising searches involves defining a correct scope for the search, ensuring that regularly-searched attributes are indexed, basing searches on object category rather than class, adjusting the ANR as required and indexing for container and medial searches where required.
  • Logging can be used to identify expensive (more than a defined number of entries visited) and inefficient (searching a defined number of objects returns less than 10% of the entries visited) searches using the 15 Field Engineering value in HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\NTDS\Diagnostics (as described in Microsoft knowledge base article 314980), combined with HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\NTDS\Parameters\Expensive Search Results Threshold (default is 10000) and HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\NTDS\Parameters\Inefficient Search Results Threshold (default is 1000) – see the sketch after this list.
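
As a sketch, enabling this logging from the command line might look like the following (level 5 for the 15 Field Engineering value is the commonly documented setting; the threshold change is optional tuning):

rem log expensive and inefficient LDAP searches to the Directory Service event log
reg add "HKLM\SYSTEM\CurrentControlSet\Services\NTDS\Diagnostics" /v "15 Field Engineering" /t REG_DWORD /d 5
rem optionally redefine what counts as an expensive search
reg add "HKLM\SYSTEM\CurrentControlSet\Services\NTDS\Parameters" /v "Expensive Search Results Threshold" /t REG_DWORD /d 5000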

However performant the directory is at returning results, the accuracy of those results is dependent upon replication (the process of making sure that directory data is available throughout the enterprise).

The AD replication model is described as multimaster (i.e. changes can be made on any DC), loosely consistent (i.e. there is latency between changes being made and their availability throughout the enterprise, so it is impossible to tell if the directory is completely up-to-date at any one time – although urgent changes will be replicated immediately) and convergent (i.e. eventually all changes will propagate to all DCs, using a conflict resolution mechanism if required).

When considering replication, a topology and architecture needs to be considered that matches performance with available bandwidth, minimises latency, minimises replication traffic and responds appropriately to loosely connected systems. In order to match performance with available bandwidth and provide efficient replication, it is important to describe the network topology to AD:

  • Sites are islands of good connectivity (once described as LAN-speed connections, but more realistically areas of the network where replication traffic has no negative impact).
  • Subnet objects are created to associate network segments with a particular site (e.g. so that a Windows client can locate the nearest DC for logons, directory searching and DFS paths – known as site or client affinity).
  • Site links characterise available bandwidth and cost (in whatever unit is chosen – money, latency, or something else).
  • Bridgehead servers are nominated (for each NC) to select the DC used for replication into and out of a site, whether replicating NCs between DCs or replicating the system volume.

A communications transport is also required. Whilst intrasite communications use RPC, intersite communications (i.e. over site links) can use IP (RPC) for synchronous inbound messaging or SMTP for asynchronous messaging; however, SMTP cannot be used for the domain NC, making SMTP useful when there is no RPC connection available (e.g. firewall restrictions) but also meaning that RPC is required to build a forest. Intersite communications are compressed by default but Windows 2000 Server and Windows Server 2003 use different compression methods – the Windows Server 2003 version is much faster, but does not compress as effectively. In reality this is not a problem as long as the link speed is 64Kbps or greater, but there are also options to revert to Windows 2000 Server compression or to disable compression altogether.

Site link bridges can be used to allow transitive replication (where a DC needs to replicate with its partner via another site). The default is for all links to be bridged; however, there must be a valid schedule on both portions of the link in order for replication to take place.

The knowledge consistency checker (KCC), which runs every 15 minutes by default, is responsible for building the replication topology, based on the information provided in the configuration container. The KCC needs to know about:

  • Sites.
  • Servers and site affinity.
  • Global catalog (GC) servers.
  • Which directory partitions are hosted on each server.
  • Site links and bridges.

For intrasite replication, the KCC runs on each DC within a site and each DC calculates its own inbound replication partners, constructing a ring topology with dual replication paths (for fault tolerance). The order of the ring is based on the numerical value of the DSA GUID and the maximum hop count between servers is three, so additional optimising connectors are created where required. Replication of a naming context (NC) can only take place via servers that hold a copy of that NC. In addition, one or more partial NCs will need to be replicated to a GC.

Because each connection object created by the KCC defines an inbound connection from a specific source DC, the destination server needs to create a partnership with the source in order to replicate changes. This process works as follows:

  1. The KCC creates the required connections.
  2. The repsFrom attribute is populated by the KCC for all common NCs (i.e. the DC learns about inbound connections).
  3. The destination server requests updates, allowing the repsTo attribute to be populated at the source.

The repsTo attribute is used to send notifications for intrasite replication; however all DCs periodically poll their partners in case any changes are missed, based on a schedule defined in the NTDS Settings object for the site and a system of notification delays to avoid all servers communicating changes at the same time.

Windows 2000 Server stores the initial notification delay (default 5 minutes) and subsequent notification delay (default 30 seconds) in the registry; whereas Windows Server 2003 stores the initial notification delay (default 15 seconds) and subsequent notification delay (default 3 seconds) within AD (although individual DCs can be controlled via registry settings). This is further complicated by the fact that Windows 2000 Server DCs upgraded to Windows Server 2003 and still running at Windows 2000 forest functionality will take the Windows 2000 timings until the forest functional level is increased to Windows Server 2003. This means that, with a three hop maximum between DCs, the maximum time taken to replicate changes from one DC within a site to another is 45 seconds for Windows Server 2003 forests and 15 minutes for Windows 2000 Server forests (based on three times the initial notification delay).

For certain events, a process known as urgent replication (with no initial notification delay) is invoked. Events that trigger urgent replication are:

  • Account lockout.
  • Changes to the account lockout policy or domain password policy.
  • Changes to the computer account.
  • Changes to domain trust passwords.

Some changes are immediately sent to the PDC emulator via RPC (a process known as immediate replication) and most of these changes also trigger urgent replication:

  • User password changes.
  • Account lockout.
  • Changes to the RID master role.
  • Changes to LSA secrets.

For intersite replication, one DC in each site is designated as the inter-site topology generator (ISTG). The KCC runs on the ISTG to calculate the inbound site connections and the ISTG automatically selects bridgehead servers. By default, notifications are not used for intersite replication (which relies on a schedule instead); however it is also possible to create affinity between sites with a high bandwidth backbone connection by setting the least significant bit of the site link’s option attribute to 1.

Be aware that schedules are displayed in local time, so if configuring schedules across time zones, the time in one site will not match the time in the other. Also be aware that deleting a connector may orphan a DC (e.g. if replication has not completed fully and the DC has insufficient knowledge of the topology to establish a new connection).

Once the replication topology is established, a server needs to know what information it needs to replicate to its partners. This needs to include:

  • Just the data that has been changed.
  • All outstanding changes (even if a partner has been offline for an extended period).
  • Alternate replication paths (without data duplication).
  • Conflict resolution.

Each change to directory data is recorded as an update sequence number (USN), written to the metadata for each individual attribute or link value. The USN is used as a high watermark vector for each inbound replication partner (and each NC), identified by their DSA GUID; the source server will send all changes that have a higher USN. Because replication works on a ring topology, a process is required to stop unnecessary replication. This is known as propagation dampening and relies on another value called the up-to-dateness vector (one for each DC where the information originated), which is used to ensure that the source server does not send changes that have already been received. The highest committed USN attribute holds the highest USN used on a particular server.
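
Both vectors can be inspected with repadmin – a sketch, with DC and NC names as placeholders:

rem show inbound partners and high watermark USNs for a DC
repadmin /showreps dc1.domainname.tld
rem show the up-to-dateness vector for the domain NC on the same DC
repadmin /showutdvec dc1.domainname.tld dc=domainname,dc=tld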

It is possible for the same attribute to be simultaneously updated at multiple locations, so each DC checks that a replicated change is “newer” than the information it holds before accepting the change. It determines which change is more up-to-date based on the replica version number, then the originating time stamp, and finally the originating invocation ID (as a tie break).

Other replication issues include:

  • If an object is added to or moved to a container on one DC as the container is deleted on another DC, then the object will be placed in the LostAndFound container.
  • Adding or moving objects on different DCs can result in two objects with the same distinguished name (DN). In this case the newer object is retained and the other object name is appended with the object GUID.

It’s worth noting that, when running in Windows Server 2003 forest functional mode, significant reductions in replication traffic can be achieved as changes to multi-valued attributes (e.g. group membership) are replicated at the value level rather than the attribute level. Not only does this reduce replication traffic, but it allows groups to be created with more than 5000 members and avoids data loss when a group membership is edited on multiple DCs within the replication latency period.

If this has whetted your appetite for tuning AD (or if you’re having problems troubleshooting AD) then I recommend that you check out John and Sally’s Active Directory Forestry book (but beware – the book scores “extreme” on the authors’ own “geekometer scale”).