
User Registration Patterns (URP)

Nov 24, 2007

In this article we will try to prove that some practices in current web apps have important privacy and security issues both for the user and for the application. To illustrate these issues better we will first analyze the patterns that the main Web 2.0 companies have used when implementing the ‘user registration process’. Then, we will go over some of the aspects of the process, such as authentication flow, email confirmation, captchas and sharing information with third parties. This will hopefully help you decide which practices you should use when implementing your app.

50 apps

Before you write a single line of code you will have to make some high-level decisions that determine what features you will need in the user registration. If you are planning on building several applications, you’d better review the user registration requirements for each of them, instead of reusing the same code and patterns for all of them.

Most of the apps in the following table have millions of registered users. To handle such a large number of users, they had to drop some old-school strategies and adopt new ones. For instance, to keep the user from picking a username that is not available, they suggest using an email address as the username (because it is unique). Also, to attract as many users as possible, they try to reduce registration to a single step where possible. As we will see, applying these two ideas the wrong way can create security or privacy flaws.


Leave a comment if you want to complete the language column.

App | Used as login | Email confirmation needed | Privacy data | Required personal data | Captcha | Language / Framework
Note: This table doesn’t include any Google or Yahoo! properties, since each of those companies uses a common account to log in to all of its apps. It also doesn’t include apps that don’t require any kind of user registration (yubnub, popurls, etc.). Please feel free to send any corrections to the content of the table.

Authentication flow

Case 1: confirmation: no. login: email.

(e.g. twitter, netvibes, hi5)

This pattern includes the apps that send a confirmation email to the user but allow the user to log in and use the app anyway, even if they haven’t gone through the confirmation process.

This practice allows any user to use a non-confirmed email as his unique username. That means that if I happen to know Bill Gates’ email, for instance, I will be able to use his email as my username in all these apps; at the same time, he won’t be able to register with his own email. In some cases the apps might have a system for expiring those non-confirmed accounts, but we found only one case where this was explicitly shown to the user (imeem.com):

“You have 10 days left to verify your email.”

One of the solutions used by some apps goes as follows:
The user is able to access the app, but only its more basic features. As soon as the user tries to perform any advanced action, or one that relies on the email (like recovering a password), they are asked to confirm the email address. This approach is not enough to solve the initial drawback: the real owner of the email is still blocked from registering with his own email.
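The expiry idea mentioned above (the imeem.com message) can be sketched roughly as follows. This is only an illustration under our own assumptions: the `EmailRegistry` class, its method names and the 10-day `GRACE_PERIOD` are hypothetical and are not taken from any of the analyzed apps.

```python
from datetime import datetime, timedelta

# Assumption: a 10-day grace period, mirroring the imeem.com message.
GRACE_PERIOD = timedelta(days=10)


class EmailRegistry:
    """Hypothetical in-memory record of which account claims which email."""

    def __init__(self):
        # normalized email -> (account_id, registered_at, confirmed)
        self._claims = {}

    def register(self, email, account_id, now):
        """Claim an email; return False if someone else still holds it."""
        key = email.strip().lower()
        claim = self._claims.get(key)
        if claim is not None:
            _, registered_at, confirmed = claim
            # An unconfirmed claim only blocks others during the grace period.
            if confirmed or now - registered_at < GRACE_PERIOD:
                return False
        self._claims[key] = (account_id, now, False)
        return True

    def confirm(self, email, account_id):
        """Mark the claim as confirmed, but only for the account holding it."""
        key = email.strip().lower()
        claim = self._claims.get(key)
        if claim is None or claim[0] != account_id:
            return False
        self._claims[key] = (account_id, claim[1], True)
        return True
```

With this sketch, someone who registers with another person’s address and never confirms it blocks the real owner for at most the grace period, instead of forever.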

Fig 1. Used as login.

Case 2: confirmation: no (blocking). login: username.

(e.g. Remember the Milk)

In this case we seem to be preventing some possible identity theft issues. The problem, though, is that many of these applications save the given email coupled with the profile data, thus not allowing a new user (maybe the real owner) to register with an email that has already been used before:

“Sorry but this email is already in use”.

This shouldn’t happen unless the owner of the email address has performed some sort of confirmation.
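One possible way to follow this rule is to enforce email uniqueness only against confirmed addresses. The sketch below is a minimal illustration, not the way any of the listed apps works; the `Accounts` class and its methods are hypothetical, and `confirm` is assumed to be called only after the user has proven ownership of the address.

```python
class Accounts:
    """Hypothetical account store: usernames are unique from the start,
    emails become exclusive only once they are confirmed."""

    def __init__(self):
        self._email_by_user = {}  # username -> normalized email
        self._confirmed = {}      # normalized email -> username that confirmed it

    def register(self, username, email):
        email = email.strip().lower()
        if username in self._email_by_user:
            raise ValueError("username is already taken")
        # Only a *confirmed* email blocks a new registration.
        if email in self._confirmed:
            raise ValueError("email is already in use")
        self._email_by_user[username] = email

    def confirm(self, username):
        """Record that `username` proved ownership of its email.

        Returns False if another account confirmed the same address first.
        """
        email = self._email_by_user[username]
        owner = self._confirmed.setdefault(email, username)
        return owner == username
```

Here two accounts may temporarily hold the same unconfirmed address, and the “already in use” message appears only after one of them has confirmed it.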

Case 3: confirmation: no (no blocking). login: username.

(e.g. Technorati)

This is the same case as the previous one, with one difference: the registration process allows the new registrant to use an email that has already been used before.
The email shouldn’t be relevant to any part of the application. If it is, the user should be forced to confirm the email address as soon as they use any feature related to it, such as notifications and reminders. Also, the email should not be publicly visible in the user’s profile until it is confirmed.
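As a rough sketch of these two rules (no email-related feature and no public email until the address is confirmed), assuming a hypothetical `User` record with an `email_confirmed` flag:

```python
from dataclasses import dataclass


@dataclass
class User:
    # Hypothetical user record; the field names are assumptions.
    username: str
    email: str
    email_confirmed: bool = False


class EmailNotConfirmed(Exception):
    """Raised when an email-dependent feature is used too early."""


def public_profile(user):
    """Build the public profile, hiding the email until it is confirmed."""
    profile = {"username": user.username}
    if user.email_confirmed:
        profile["email"] = user.email
    return profile


def send_reminder(user, text, outbox):
    """Queue a reminder email, refusing if the address is unconfirmed."""
    if not user.email_confirmed:
        raise EmailNotConfirmed(user.username)
    outbox.append((user.email, text))
```

The caller can catch `EmailNotConfirmed` and redirect the user to the confirmation step at that point.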

Case 4: confirmation: yes. login: email/username.

(e.g. YouTube, pbWiki)

This is the best practice so far. The user is registered in a two-step process: they fill in the form and then confirm the registration by clicking on the link provided in an email. This pattern has two drawbacks. The first one is obvious: the user has to go through a longer process than the one-step registration. The second is explained in the next section.

Fig 2. Email confirmation.

Confirmation process.

As we have seen in the last case, once we fill in the form an email is sent to our inbox with a confirmation link that completes the registration process. In some cases, an auto-generated password or username is sent in the email instead of a link (e.g. stumbleupon.com). The goal in both cases is to prove that the user owns a given email account.
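A minimal sketch of how such a confirmation link could be backed by a random, single-use token. The `ConfirmationTokens` class and the 24-hour lifetime are our assumptions, not something observed in the analyzed apps; only standard-library calls are used.

```python
import hashlib
import secrets
import time

# Assumption: confirmation links stay valid for 24 hours.
TOKEN_TTL_SECONDS = 24 * 60 * 60


class ConfirmationTokens:
    """Hypothetical in-memory store of pending email confirmations."""

    def __init__(self):
        # sha256(token) -> (email, expires_at); the raw token is never stored.
        self._pending = {}

    def issue(self, email, now=None):
        """Create an unguessable token to embed in the confirmation link."""
        now = time.time() if now is None else now
        token = secrets.token_urlsafe(32)
        digest = hashlib.sha256(token.encode()).hexdigest()
        self._pending[digest] = (email, now + TOKEN_TTL_SECONDS)
        return token

    def redeem(self, token, now=None):
        """Return the confirmed email, or None if the token is unknown,
        already used, or expired. A token can be redeemed only once."""
        now = time.time() if now is None else now
        digest = hashlib.sha256(token.encode()).hexdigest()
        entry = self._pending.pop(digest, None)
        if entry is None:
            return None
        email, expires_at = entry
        return email if now <= expires_at else None
```

The token returned by `issue` would be appended to the link sent by email, and the page behind that link would call `redeem` with it.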

We can draw a parallel here to public-key cryptography:

Fig 3. Public-key Cryptography [Source: Wikipedia]
“The most important thing to know about public key cryptography is that unlike earlier cryptographic systems, it relies not on a single key (a password or a secret “code”), but on two keys. These keys are numbers that are mathematically related in such a way that if either key is used to encrypt a message, the other key must be used to decrypt it. (…) By making one of the keys available publicly (a public key) and keeping the other key private (a private key), a person can prove that he or she holds the private key simply by encrypting a message. If the message can be decrypted using the public key, the person must have used the private key to encrypt the message.” [Source: The globus alliance]

In our case, the email address plays the role of the public key: we share it with the app so that it can send us a message containing a link meant only for us. We are the only ones able to read that message (“decrypting” it), because only we can open our inbox with the email password (the private key). Clicking on the link proves that we own the address.

The problem arises if we misspell the email in the initial form, something that has happened to everyone at least once. In that case the confirmation email will be sent to another (potentially existing) account. If the owner of this other account completes the process, they will be able to access the application and the profile data that we entered in the initial form. They will also be able to perform a password recovery. That’s why some apps ask you to type your email twice, as if it were a password.
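The “type your email twice” check could look something like the sketch below. The function name and the error messages are hypothetical, and the format test is deliberately crude.

```python
def check_email_fields(email, email_again):
    """Return a list of error messages for the two email form fields."""
    first = email.strip()
    second = email_again.strip()
    errors = []
    # Very rough shape check: exactly one "@", not at either end.
    if first.count("@") != 1 or first.startswith("@") or first.endswith("@"):
        errors.append("The email address does not look valid.")
    # A typo made in only one of the two fields is caught here.
    if first.lower() != second.lower():
        errors.append("The two email addresses do not match.")
    return errors
```

Note that this only catches a typo made in one of the two fields; it does not help if the user pastes the same misspelled address twice, so it complements the confirmation email rather than replacing it.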

Fortunately, none of the apps that we analyzed ask you for confidential data, like the street where you live or your phone number. But most of them ask you for your zip code, your date of birth or your full name. These data can also be considered sensitive for the privacy of the user and can be misused by the fake user.

It’s clear that something needs to change in the authentication flow to ensure that anyone who wrongly receives the confirmation link or password isn’t able to log in to the account so easily.

Accessing a third party’s contact list

Another practice is common to quite a few applications: after the registration is done, the app allows the user to import their contacts from other accounts (gmail, hotmail, etc.). In this situation the interface should be very unobtrusive. It’s very important to make the user confident that their privacy is under control. The truth is that asking for the user’s data from a third-party account is aggressive enough already.

Some apps need to give a clearer message that none of this information will be stored for any use other than importing the contacts during that process. More importantly, a ‘skip this step’ message, if it exists, should be very noticeable (instead of being tiny lettering at the bottom of the page) so that the user is free to decide whether or not to go through with the process.

As an example of a bad practice we have Tagged.com, which uses a very aggressive method: after finishing the registration process, we are redirected to the ‘add contacts from third parties’ process. In this stage we are not provided with any link that allows us to skip the form. There is no link to go back to the home page, and not even the main company logo is linkable.

Captcha

There are still quite a few apps that don’t prevent spambots from automatically registering with their sites. Surprisingly, those that do use captchas sometimes show the same image on every registration page in the same process. That happens when the user fills in a form field incorrectly and the application gives the user another opportunity to do it. This is a really bad practice since, in order to deter spambots, a given captcha image should be shown only once, and it should always be replaced with a new image if an incorrect response is given.
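The “shown only once” rule can be sketched as a store in which every challenge is discarded after its first verification attempt, whether the answer was right or wrong. The `CaptchaStore` class is hypothetical and leaves out image generation entirely.

```python
import secrets


class CaptchaStore:
    """Hypothetical single-use captcha bookkeeping (no image rendering)."""

    def __init__(self):
        self._answers = {}  # challenge_id -> expected answer

    def new_challenge(self, answer):
        """Register a freshly generated captcha and return its id."""
        challenge_id = secrets.token_urlsafe(16)
        self._answers[challenge_id] = answer
        return challenge_id

    def verify(self, challenge_id, response):
        """Check a response; the challenge is consumed either way."""
        expected = self._answers.pop(challenge_id, None)
        if expected is None:
            return False  # unknown id, or the challenge was already used
        return secrets.compare_digest(
            expected.lower().encode(), response.strip().lower().encode()
        )
```

When the form is shown again after an error, the app would call `new_challenge` with a new answer and render a new image, so an old image can never be answered twice.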

Fig 4. Use of captcha.

User privacy

Web application companies will eventually have a huge number of registered users, if they are popular. The companies store the user data in their databases. PrivacyInternational.org recently released an excellent study about the privacy practices of key Internet-based companies. They analyze details like data retention, responsiveness, ethical compass and many others.

In our case, we will focus on the elements presented to the user in the initial form of the registration process. In this step of the process the company needs to be as transparent as possible to the user, giving them easy access to the terms of service and privacy policy documents. As the pie chart indicates, 24% of the apps still don’t clearly provide the user with any link or information about the privacy policy or the terms of service in the sign-up form. On top of that, those that do provide this information (76%) don’t necessarily do it in an active way: 31% of them don’t include a check box that the user must select to explicitly accept the company policies or terms of service in order to complete the sign-up process.
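As an illustration of “active” acceptance, the server side can refuse a sign-up whose form data lacks the check box. The field names below are assumptions; the value "on" is what browsers submit for a checked check box that has no explicit value attribute.

```python
def validate_signup(form):
    """Return a list of error messages for submitted sign-up form data."""
    errors = []
    if not form.get("email", "").strip():
        errors.append("Email is required.")
    # An unchecked check box is simply absent from the submitted data.
    if form.get("accept_terms") != "on":
        errors.append("You must accept the terms of service and privacy policy.")
    return errors
```

The form itself would still need visible links to both documents next to the check box.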

Fig 5. Display of TOS/PP.

Alternatives

As we have seen in this article, we as web developers have to foresee every kind of action our users are able to perform with the choices we give them. We have to test for and prevent all possible hacks and errors, check whether these actions create security or privacy issues for the user, and fix them. With all these principles in mind, it’s evident that the patterns used by most of the current well-known apps are not completely prepared for that.

Some alternatives like OpenID and OAuth are coming up with open specifications to try to change the current paradigms. Each of them, though, is meant to solve a different issue:

“While OpenID is all about using a single identity to sign into many sites, OAuth is about giving access to your stuff without sharing your identity at all (or its secret parts). (…) OAuth talks about getting users to grant access while OpenID talks about making sure the users are really who they say they are.” [Source: OAuth.net - about]

That means that while OpenID is an alternative to the four different authentication flows we analyzed, OAuth would take care of the ‘Accessing a third party’s contact list’ issue we have seen before.

OAuth released a draft so apps can use a common protocol instead of the different proprietary protocols used so far for the same purpose (Google AuthSub, AOL OpenAuth, Yahoo BBAuth, Upcoming API, Flickr API, Amazon Web Services API, etc.). OpenID has been available longer but, unfortunately, only one out of the fifty analyzed apps (mag.nolia.com) showed very clearly, in its registration process, the ability to use OpenID (with a big OpenID logo). Not even livejournal.com, which is an OpenID provider, had it as visible as it should be.


5 Responses to “User Registration Patterns (URP)”

  1. Robert Says:

    Facebook is made in php and, I think that 43places, twitter and odeo are RoR.
    :)

  2. Nate Koechley Says:

    Hi Ernest,

    This is an outstanding piece of research. Thanks very much for taking the time to do all the work. (Gathering all those screenshots must have taken forever, and that’s just one piece!)

    As I wrote in the summary when I tagged this on delicious, “outstanding audit of the registration, authentication, privacy, and security flows of 50 leading web sites, resulting in a great understanding of the design patterns involved.”

    thanks
    nate

  3. Peter Nixey Says:

    Great article Ernest. I’m always surprised how little “best practice” documents there are out there for registration and authentication processes and you’ve identified a load of interesting elements relevant to both.

  4. James Thompson Says:

    I wonder whether a three stage registration process might not be a good practice:

    1.) Request an account by entering your email and responding to a CAPTCHA prompt.
    2.) Confirm your intentions by responding to a confirmation email.
    3.) Supply needed information to the site after coming back via the confirmation email

    This would seem to ensure the “realness” of a user and prevent the user from possibly disclosing personal data by mistake.

  5. Ernest Delgado Says:

    @James: That’s a very good point. With a simple and consistent process one can avoid most of the issues previously exposed.

    The app would only need to use a good captcha tool, ensure that there’s a mandatory email confirmation process, and ask for any sensitive information only after the user has confirmed his account.

    Surprisingly, very few apps strictly follow all these simple steps.
