Creating an iPhone video chat app using Parse and Opentok (tokbox)

Nirav, Apr 18, 2014

Creating an iPhone video chat app isn't rocket science, but there are a lot of intricacies to account for. This tutorial covers the entire process of developing such an app from start to finish. It is quite long, so you might want to bookmark it and revisit it later.

The end result: The entire project can be downloaded here. You can browse it if you want to figure things out for yourself; if you would rather have a detailed walkthrough, read on.

Video chat is one of the most popular forms of communication on mobile, and it is used in productivity apps as well as social networking.

So how does one go about building such an app? The question is more one of design than of technical implementation, since many online services already provide much of the infrastructure needed.

By the end of this tutorial, our app will look like this:

iPhone video chat

Yes, that is a monkey chatting with a mammoth, which is basically what we will be doing when communicating with the services that make up our app.

The Basics of a video Chat app:

While the top video chat applications use their own streaming servers, we will be using a cloud-based solution to handle that part of the equation. Opentok is the service that we'll be using in this tutorial, and it provides a nice, easy-to-integrate SDK for adding video streaming capabilities to our app.

While Opentok provides its iOS API and iOS SDK with two useful examples of iOS implementation (here and here), there is still one piece left: user management. Opentok provides a nice platform for video chat, but between whom? Let's take a deeper dive into how streaming actually works between users of services like Skype, MSN, and Yahoo video chat.

A streaming server works on the fundamentals of an HTTP web server. In a pretty raw representation, all it recognizes is a request and a response, without caring who sent it. An HTTP server does just that. It does not worry about state, user IDs, passwords or any access token mechanism. In other words, session management is absent.

The scenario changes when there arises a need to identify the user, authenticate him, and keep him authenticated during his entire browsing experience. A server needs to remember a user, and whatever other entities come linked with him. These requirements become more stringent when developing mobile software such as an iPhone video chat app, where identities are quite crucial - they can make or break the authenticity of your app. In large web portals such as Yahoo.com or Amazon.com, dedicated authentication servers generate a unique identity for each user, and supply application servers with these goodies so that they can track users until they log out.

To perform its task, a streaming server relies on only one entity to recognize who sent the request and whom to respond to - this entity is the session ID. It does not know or care who generated it. As long as it receives a valid sessionID, it keeps replying to streaming requests. And here comes the rule of thumb for any chat app:

Any user with a valid sessionID is, in principle, automatically entitled to view the feed of any other user who is using the same sessionID. Session IDs are of the form:

2_MX4yNjU5MzIyMn5-VHVlIEFwciAyMyAwMjoxMTo0NCBQRFQgMjAxM34wLjEzMTIwMTYyfg

You probably know where this is going: To keep track of who wants to connect to whom in a more real-world way, as we do in Yahoo Messenger or Skype, we need another server that keeps track of users. User management is essential in any social networking app. How far you take it is up to you - you can store extensive data including addresses, phone numbers, profile pics and the like, or you can pick 3-4 fields of your choice to make it easier on the user. But the crux of the matter is that you need to do it. Any software that relies on user interaction cannot do without it - and our iPhone video chat app is no exception.

To handle user management, we need a central server. To our great comfort, there are a number of cloud solutions available here too. For the purpose of this tutorial, I have chosen Parse.com because it boasts 100k+ live apps and is backed by Facebook. It claims to have the easiest of APIs, and in my experience that claim holds up. What's more, it's cross-platform, so if you plan to make an Android or Windows mobile client for your video chat app, you are in safe hands.

The objectives of this tutorial are to show:

  • How to let Parse.com users see each other (like the who’s-online list in your favorite messenger)
  • How to make them talk to each other using the Opentok video streaming feature in your next great iPhone video chat app

Tokbox provides a Broadcast tutorial which contains some of what we need for a fully featured chat app. The main difference is that we're creating a two-way chat instead of a one-to-many broadcast system. I'll also try to simplify the process so that even novice developers should be able to finish it without much trouble.

Setup (Parse.com and Tokbox.com) for your iPhone video chat app - LiveSessions:

My showcase app - LiveSessions - requires some configuration on both the Tokbox.com and Parse.com profiles - the server side. Parse.com needs an app for it, and so does Tokbox.com, although there it is called a Project.

I assume you know enough to configure an app on Parse.com. In addition to it, you also need 2 data tables - ActiveUsers and ActiveSessions, though there is no need to pre-populate them at any point. Here is a snap of what their column lists will look like:

ActiveUsers

  • userID - String
  • userLocation - Geopoint
  • userTitle - String

ActiveSessions

  • callerID - String
  • callerTitle - String
  • receiverID - String
  • sessionID - String
  • publisherToken - String
  • subscriberToken - String
  • isVideo - Boolean
  • isAudio - Boolean

Similarly, on Tokbox.com, once you log in, go to the dashboard and create a new project. This will automatically create an API key and an API secret. Note that these are your credentials as a Tokbox developer, not as a chat user. The API key is more like your user ID, and the API secret is like a password.

The resulting Tokbox dashboard screen will look like this (the real API key and text are hidden):

Tokbox API Keys

iPhone video chat

There is one crucial piece left to finish the server side. As we discussed earlier, we need to connect our streaming server (Tokbox.com) with our authentication server (Parse.com). Unless this is done, the streaming server has no way to know which user is calling whom, because all it knows about is the session ID. Our iPhone video chat user, on the other hand, only knows about the user credentials supplied to him by Parse.com.

Clearly, it is Parse.com’s job to connect the two. That is:

  • To intercept the logged-in user’s request to chat (the caller).
  • To turn that request into a Tokbox session by obtaining a session ID and the other necessary information (the tokens) from Tokbox.
  • To make the session ID and a token available to both the caller and the receiver (Parse.com users).
  • Since both users now have a session ID and a token received from Parse.com, they can seamlessly communicate via the Tokbox streaming server.

Chat Architecture

To accomplish the above, Parse.com must obtain the sessionID, publisherToken and subscriberToken from Tokbox, and that is where Parse Cloud Code comes into the picture. In the Parse Cloud Code section, you must upload some code so that the resulting screen looks like this:

iPhone video chat

The process of uploading (oops, deploying) cloud code on Parse.com is described in this source tutorial (Setting Up section) and here, far better than I could explain it. So let’s skip it to stay within scope. However, for simplicity's sake, I have included it step by step in the readme.txt that comes with the code.

I would still like to take a few moments to explain what this code does, and why. The iPhone app user initiates a call to another user. No, he does not make a phone call. Remember, it’s the LiveSessions app’s job to handle the entire call, just as Skype or Yahoo Messenger does. What our iPhone app needs to do under the hood is a fairly simple task: it saves a row to the ActiveSessions Parse table we just created. There, it stores the caller's user ID (callerID) and the receiver's user ID (receiverID), among other things.

Now what this cloud code does is something quite magical, yet simple. It intercepts the save operation in its beforeSave cloud trigger - nicely elaborated here. From within beforeSave, the Opentok JavaScript API takes over. The Opentok-supplied function createSession generates a session ID. Another function, Opentok.generateToken, creates a publisher token or a subscriber token, depending on the role argument passed, which decides whether you want to publish your own video feed (Opentok.ROLE.PUBLISHER) or see the other user’s video feed (Opentok.ROLE.SUBSCRIBER) using that session ID. (It is unclear from my experience if there is any difference between the two. For the scope of this discussion let us generate both, as we need two-way video feeds anyway.)

Since we are inside the beforeSave trigger, we already have a handle to the object being saved - the ActiveSessions row. By simply setting its respective members, we can save three distinct items to our Parse.com database: sessionID, publisherToken and subscriberToken - all three being columns in the ActiveSessions table.

Well - that’s that. What? You are done with the back end of your first iPhone video chat app! Congratulations!

But how? All we required for two users to connect via the streaming server is a user ID (the session ID) and a password (the token) - and we have both. Now all we have to do is make them available to the iOS client via Parse.com. Not that it’s quite as easy as 1-2-3, but the mammoth has bitten the dust already.

Serve yourself a cup of hot coffee as you wait to see one of your friends on your iPhone video app screen! Well, not quite quickly, but before you jump in, you can do them some favor: read this disclaimer if it can help any of your Android friends:

iOS client - the other part of the deal:

Now that the back end is done, let’s focus on how the iOS client - our own LiveSessions iPhone app - keeps its part of the deal. The major tasks we aim to cover are:

1) Initiate an iPhone video chat call - by saving an ActiveSessions row (we already described the server part of it above as part of the cloud code)

2) Handle an incoming call - check the Parse.com database for an incoming video call - described down the line.

Let’s tackle them one by one. But first and foremost, let’s set up the basic iOS project and go over everything that is needed.

iPhone Video Chat Initial Project Setup:

Open Xcode and set up a Single View Application with default options. Name it LiveSessions. In the storyboard, set up 2 scenes: one for the users list, named LSViewController (derived from UIViewController), and another for hosting the video chat view, LSStreamingViewController (again, a subclass of UIViewController).

Next, add a UITableView object to the LSViewController scene through the storyboard. This table view maintains a list of Parse.com users who are logged into the LiveSessions app at any time. LSViewController will handle all the chores related to the UITableView, nothing unusual. Do not forget to implement the UITableViewDelegate and UITableViewDataSource protocols in LSViewController to handle table view related stuff.

In addition to the above, we also need a helper class called ParseHelper which wraps our calls to Parse. All of its members and functions would be static.
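To make the "all static" design concrete, here is a minimal sketch of what such a helper can look like. It is only an illustration: the class name HelperSketch and its two methods are hypothetical stand-ins (the real ParseHelper wraps Parse calls, which are left out here so the snippet depends on Foundation alone), and ARC is assumed.

```objc
#import <Foundation/Foundation.h>

// Hypothetical stand-in for ParseHelper: class methods only, with state held
// in a file-scope static variable instead of instance variables.
@interface HelperSketch : NSObject
+ (void)setLoggedInUserID:(NSString *)userID;
+ (NSString *)loggedInUserID;
@end

// File-scope static: shared by every caller, visible only inside this file.
static NSString *sLoggedInUserID = nil;

@implementation HelperSketch

+ (void)setLoggedInUserID:(NSString *)userID
{
    sLoggedInUserID = [userID copy];
}

+ (NSString *)loggedInUserID
{
    return sLoggedInUserID;
}

@end

int main(void)
{
    @autoreleasepool
    {
        // No instance is ever created; the class itself is the receiver.
        [HelperSketch setLoggedInUserID:@"user-123"];
        NSLog(@"Logged-in user: %@", [HelperSketch loggedInUserID]);
    }
    return 0;
}
```

The trade-off of this style is convenience versus testability: any file can call the helper without passing an object around, but the shared static state behaves like a global.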

To use the Parse and Opentok frameworks, we need to link them, along with some of the frameworks that both of them require.

Parse framework can be obtained from here.

Opentok SDK can be obtained from here. This link also explains at length what all you need to do in order to link Opentok framework successfully.

Next, you should add some libraries to LiveSessions by selecting it and going to the Build Phases -> Link Binary section. After adding a number of frameworks, this section should look like this:

Frameworks

At the end of linking everything, your Project tree should look like this:

Project Tree

(Notice the Opentok.bundle that comes as part of the Opentok SDK. Also note the three other libraries below it that are needed to link everything together.)

Before we proceed, let’s have one look at how beautiful (!) our storyboard looks:

Storyboard

Step 1 - Initiate Video Call:

Parse.com acts as a mediator between caller, Opentok streaming engine and receiver. The first and foremost requirement is to generate sessionID, publisherToken and subscriberToken, so that both clients can seamlessly connect via Opentok once session is established. We already know the Parse.com cloud code does that. But how will LiveSessions app invoke the cloud code?

The following code not only stores an ActiveSessions object to Parse, but also invokes cloud code (beforeSave trigger) we discussed above that generates sessionID, publisherToken and subscriberToken - and they are eventually stored into ActiveSessions table itself.

//ParseHelper.m
//will initiate the call by saving session
//if there is a session already existing, do not save,
//just pop an alert
+(void)saveSessionToParse:(NSDictionary *)inputDict
{    
    NSString * receiverID = [inputDict objectForKey:@"receiverID"];

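    //check if the recipient is either the caller or receiver in one of the activesessions.
    //note: %@ must not be wrapped in quotes, otherwise it is treated as a literal
    //and the receiverID is never substituted.
    NSPredicate *predicate = [NSPredicate predicateWithFormat:
                              @"receiverID = %@ OR callerID = %@", receiverID, receiverID];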
    PFQuery *query = [PFQuery queryWithClassName:@"ActiveSessions" predicate:predicate];

    [query getFirstObjectInBackgroundWithBlock:^
    (PFObject *object, NSError *error)
    {
        if (!object)
        {
            NSLog(@"No session with receiverID exists.");
            [self storeToParse:inputDict];
        }
        else
        {
           [[NSNotificationCenter defaultCenter] postNotification:[NSNotification notificationWithName:kReceiverBusyNotification object:nil]];
           return;
        }
    }];
}

+(void) storeToParse:(NSDictionary *)inputDict
{
    __block PFObject *activeSession = [PFObject objectWithClassName:@"ActiveSessions"];
    NSString * callerID = [inputDict objectForKey:@"callerID"];
    if (callerID)
    {
        [activeSession setObject:callerID forKey:@"callerID"];
    }
    bool bAudio = [[inputDict objectForKey:@"isAudio"]boolValue];
    [activeSession setObject:[NSNumber numberWithBool:bAudio] forKey:@"isAudio"];

    bool bVideo = [[inputDict objectForKey:@"isVideo"]boolValue];
    [activeSession setObject:[NSNumber numberWithBool:bVideo] forKey:@"isVideo"];

    NSString * receiverID = [inputDict objectForKey:@"receiverID"];
    if (receiverID)
    {
        [activeSession setObject:receiverID forKey:@"receiverID"];
    }

    //callerTitle (guard on callerTitle itself, since setObject: must not receive nil)
    NSString * callerTitle = [inputDict objectForKey:@"callerTitle"];
    if (callerTitle)
    {
        [activeSession setObject:callerTitle forKey:@"callerTitle"];
    }

    [activeSession saveInBackgroundWithBlock:^(BOOL succeeded, NSError* error)
    {
        if (!error)
        {
             NSLog(@"sessionID: %@, publisherToken: %@ , subscriberToken: %@", activeSession[@"sessionID"],activeSession[@"publisherToken"],
                   activeSession[@"subscriberToken"]);

             LSAppDelegate * appDelegate = [[UIApplication sharedApplication] delegate];
             appDelegate.sessionID = activeSession[@"sessionID"];
             appDelegate.subscriberToken = activeSession[@"subscriberToken"];
             appDelegate.publisherToken = activeSession[@"publisherToken"];
             appDelegate.callerTitle = activeSession[@"callerTitle"];
             [[NSNotificationCenter defaultCenter] postNotification:[NSNotification notificationWithName:kSessionSavedNotification object:nil]];
         }
         else
         {
             NSLog(@"savesession error!!! %@", [error localizedDescription]);
             NSString * msg = [NSString stringWithFormat:@"Failed to save outgoing call session. Please try again.  %@", [error localizedDescription]];
             [self showAlert:msg];
         }         
     }];
}

Once saveSessionToParse and storeToParse have run, we should have a sessionID, a publisherToken and a subscriberToken in our Parse.com ActiveSessions table. Alright, but who will execute them? A lot of questions still remain unanswered - for example, where do all the argument values (receiverID, callerID) come from? We deliberately skipped that part, because establishing the session was most important. The callerID and receiverID parameters that we used above are actually just the user IDs generated by the Parse.com PFUser object. You can have your own way of registering and authenticating a user. In LiveSessions, we just store each user in the ActiveUsers table, using only a user title of his / her own choice. No emails, passwords or verification. And here is the code that is responsible for it:

//ParseHelper.m
+(void) showUserTitlePrompt
{
    UIAlertView *userNameAlert = [[UIAlertView alloc] initWithTitle:@"LiveSessions" message:@"Enter your name:" delegate:self cancelButtonTitle:nil otherButtonTitles:@"OK", nil];
    userNameAlert.alertViewStyle = UIAlertViewStylePlainTextInput;
    userNameAlert.tag = kUIAlertViewTagUserName;
    [userNameAlert show];
}

+(void) anonymousLogin
{
    loggedInUser = [PFUser currentUser];
    if (loggedInUser)
    {
        [self showUserTitlePrompt];       
        return;
    }

    [PFAnonymousUtils logInWithBlock:^(PFUser *user, NSError *error)
     {
         if (error)
         {
             NSLog(@"Anonymous login failed.%@", [error localizedDescription]);
             NSString * msg = [NSString stringWithFormat:@"Failed to login anonymously. Please try again.  %@", [error localizedDescription]];
             [self showAlert:msg];
         }
         else
         {            
             //keep the user returned by the anonymous login
             loggedInUser = user;
             [self showUserTitlePrompt];
         }
     }];
}

What this does is simple: when the app launches, it checks for the locally stored Parse user ([PFUser currentUser]), and if one does not exist, it performs an anonymous login, which creates a PFUser object in the Parse.com Users table. What is important to us is the loggedInUser static object that we use to store the currently logged-in user. At the end of a successful login, the showUserTitlePrompt function prompts the user to enter a title of his / her choice.

Fine, but what happens when the user enters it? Well, a significant number of things. For a start, here is how LiveSessions handles it:

//ParseHelper.m
+ (void)alertView:(UIAlertView *)alertView clickedButtonAtIndex:(NSInteger)buttonIndex
{
    if (kUIAlertViewTagUserName == alertView.tag)
    {
        //let's defer saving the title till we have the location.
        //saveuserwithlocationtoparse will handle it.
        LSAppDelegate * appDelegate = [[UIApplication sharedApplication] delegate];
        appDelegate.userTitle = [[alertView textFieldAtIndex:0].text copy];
        appDelegate.bFullyLoggedIn = YES;

        //fire appdelegate timer
        [appDelegate fireListeningTimer];
        [[NSNotificationCenter defaultCenter] postNotification:[NSNotification notificationWithName:kLoggedInNotification object:nil]];
    }
    else if (kUIAlertViewTagIncomingCall == alertView.tag)
    {
        if (buttonIndex != [alertView cancelButtonIndex])   //accept the call
        {
            //accept the call
            [[NSNotificationCenter defaultCenter] postNotification:[NSNotification notificationWithName:kIncomingCallNotification object:nil]];
        }
        else
        {
            //user did not accept call, restart timer          
            //start polling for new call.
            [self setPollingTimer:YES];
        }
    }
}

Notice the part under the tag kUIAlertViewTagUserName. This code tells LiveSessions that the user is now fully logged in, along with an identification (title) of his / her choice. This title will eventually be stored in the ActiveUsers table as userTitle, but with one more thing: the user's current location. Yes, LiveSessions is a location-aware app. To obtain the user's location, ParseHelper.m posts a kLoggedInNotification notification to LSViewController. LSViewController has CLLocationManager code inside it which tracks the user's current location. At the end, once we have everything, the entire user (title, user ID and location) is saved into the ActiveUsers table.
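The listings in this tutorial show notifications being posted, but not the observer side. As a rough sketch of how a receiver can wire itself up, the Foundation-only example below registers for a notification and reacts in a didLogin method. The class name ListControllerSketch, the property, and the string value of the notification name are assumptions made for illustration; the project's real kLoggedInNotification constant is defined elsewhere.

```objc
#import <Foundation/Foundation.h>

// Assumed value: the definition of the real constant is not shown in this tutorial.
static NSString * const kLoggedInNotificationSketch = @"kLoggedInNotification";

// Hypothetical stand-in for the view controller that listens for the login event.
@interface ListControllerSketch : NSObject
@property (nonatomic) BOOL didStartLocationUpdates;
@end

@implementation ListControllerSketch

- (instancetype)init
{
    self = [super init];
    if (self)
    {
        // Register once; didLogin runs every time the notification is posted.
        [[NSNotificationCenter defaultCenter] addObserver:self
                                                 selector:@selector(didLogin)
                                                     name:kLoggedInNotificationSketch
                                                   object:nil];
    }
    return self;
}

- (void)didLogin
{
    // The real controller would start its CLLocationManager here.
    self.didStartLocationUpdates = YES;
}

- (void)dealloc
{
    [[NSNotificationCenter defaultCenter] removeObserver:self];
}

@end

int main(void)
{
    @autoreleasepool
    {
        ListControllerSketch *controller = [[ListControllerSketch alloc] init];

        // The poster does not need a reference to the controller at all.
        [[NSNotificationCenter defaultCenter] postNotificationName:kLoggedInNotificationSketch
                                                            object:nil];

        // Delivery is synchronous on the posting thread, so the flag is already set.
        NSLog(@"Location updates started: %@", controller.didStartLocationUpdates ? @"YES" : @"NO");
    }
    return 0;
}
```

This decoupling is what lets a static helper such as ParseHelper signal a view controller without holding a pointer to it.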

Here is what goes inside LSViewController to obtain user's current location, and call to Parse wrapper for storing it to ActiveUsers table:

//LSViewController.m
//Called in response of kLoggedInNotification 
- (void) didLogin
{
   [self startUpdate];
}

#pragma mark - Location methods
//this will invoke locationManager to track user's current location
- (void)startUpdate
{
    if (locationManager)
    {
        [locationManager stopUpdatingLocation];
    }
    else
    {
        locationManager = [[CLLocationManager alloc] init];
        [locationManager setDelegate:self];
        [locationManager setDesiredAccuracy:kCLLocationAccuracyBestForNavigation];
        [locationManager setDistanceFilter:30.0];
    }

    [locationManager startUpdatingLocation];
}

//stop tracking location
- (void)stopUpdate
{
    if (locationManager)
    {
        [locationManager stopUpdatingLocation];
    }
}

//this will store finalized user location. 
//once done, it will save it in ActiveUsers row and then fetch nearer users to show in table.
- (void)locationManager:(CLLocationManager *)manager
    didUpdateToLocation:(CLLocation *)newLocation
           fromLocation:(CLLocation *)oldLocation
{  

    CLLocationDistance meters = [newLocation distanceFromLocation:oldLocation];
    //discard if inaccurate, or if user hasn't moved much.
    if (meters != -1 && meters < 50.0)
        return;

    NSLog(@"## Latitude  : %f", newLocation.coordinate.latitude);
    NSLog(@"## Longitude : %f", newLocation.coordinate.longitude);

    appDelegate.currentLocation = newLocation;

    //pause the updates, until didUserLocSaved is called
    //via kUserLocSavedNotification notification, to avoid multiple saves.
    [self stopUpdate];

    PFUser * thisUser = [ParseHelper loggedInUser] ;

    [ParseHelper saveUserWithLocationToParse:thisUser :[PFGeoPoint geoPointWithLocation:appDelegate.currentLocation]];
    [self fireNearUsersQuery:RANGE_IN_MILES :appDelegate.currentLocation.coordinate :YES];
}

The first unknown in the code so far is the call to the fireNearUsersQuery function, which serves the front end. We will come to it later. The other unknown is the saveUserWithLocationToParse function, which fills the gaps left so far to complete the back end. It belongs to ParseHelper.m, and here it is - there is nothing unusual about it, and the ActiveUsers table it writes to acts as our own little user repository. The object ID of the saved row is stored for later use in activeUserObjectID.

//ParseHelper.m
+ (void) saveUserWithLocationToParse:(PFUser*) user :(PFGeoPoint *) geopoint
{
    __block PFObject *activeUser;

    PFQuery *query = [PFQuery queryWithClassName:@"ActiveUsers"];
    [query whereKey:@"userID" equalTo:user.objectId];
    [query findObjectsInBackgroundWithBlock:^(NSArray *objects, NSError *error)
    {
        if (!error)
        {
            // if user is active user already, just update the entry
            // otherwise create it.
            if (objects.count == 0)
            {
                activeUser = [PFObject objectWithClassName:@"ActiveUsers"];
            }
            else
            {                
                activeUser = (PFObject *)[objects objectAtIndex:0];
            }
            LSAppDelegate * appDelegate = [[UIApplication sharedApplication] delegate];
            [activeUser setObject:user.objectId forKey:@"userID"];
            [activeUser setObject:geopoint forKey:@"userLocation"];
            [activeUser setObject:appDelegate.userTitle forKey:@"userTitle"];
            [activeUser saveInBackgroundWithBlock:^(BOOL succeeded, NSError *error)
            {
                if (error)
                {
                    NSString * errordesc = [NSString stringWithFormat:@"Save to ActiveUsers failed.%@", [error localizedDescription]];
                    [self showAlert:errordesc];
                    NSLog(@"%@", errordesc);
                }
                else
                {
                    NSLog(@"Save to ActiveUsers succeeded.");
                    activeUserObjectID = activeUser.objectId;

                    NSLog(@"%@", activeUserObjectID);
                }
                [[NSNotificationCenter defaultCenter] postNotification:[NSNotification notificationWithName:kUserLocSavedNotification object:nil]];
            }];
        }
        else
        {
            NSString * msg = [NSString stringWithFormat:@"Failed to save updated location. Please try again.  %@", [error localizedDescription]];
            [self showAlert:msg];
        }
    }];
}

The code so far ensures that a user is saved in the ActiveUsers table. We also saw how he / she can initiate a video call to another user, by creating an ActiveSessions object. But whom does the user chat with?

We must also present the logged-in user with a list of users to chat with - the equivalent of a Yahoo/Skype friends list. Sending friend requests through email or any other means would be quite an overkill for our tutorial's scope. To keep things minimal, we don’t even ask our users to enter their email ID for registration.

Instead, we have chosen a unique way to test out the video chat feature: show a list of users who are geographically within a specified radius - say 200 miles. Parse.com already has a PFGeoPoint-related query mechanism which makes our task easier.

The other unknown in the code above, fireNearUsersQuery, goes as below. It fills up the data source for the LSViewController table view - an NSMutableArray made of dictionaries holding each user's ID and title:

//LSViewController.m
//this method polls for new users that gets added / removed from surrounding region.
//distanceinMiles - range in Miles
//bRefreshUI - whether to refresh table UI
//argCoord - location around which to execute the search.
-(void) fireNearUsersQuery : (CLLocationDistance) distanceinMiles :(CLLocationCoordinate2D)argCoord :(bool)bRefreshUI
{
    CGFloat miles = distanceinMiles;
    NSLog(@"fireNearUsersQuery %f",miles);

    PFQuery *query = [PFQuery queryWithClassName:@"ActiveUsers"];
    [query setLimit:1000];
    [query whereKey:@"userLocation"
       nearGeoPoint:
     [PFGeoPoint geoPointWithLatitude:argCoord.latitude longitude:argCoord.longitude] withinMiles:miles];    

    //delete all existing rows,first from front end, then from data source. 
    [m_userArray removeAllObjects];
    [m_userTableView reloadData];    

    [query findObjectsInBackgroundWithBlock:^(NSArray *objects, NSError *error)
    {
        if (!error)
        {
            for (PFObject *object in objects)
            {
                //if for this user, skip it.
                NSString *userID = [object valueForKey:@"userID"];
                NSString *currentuser = [ParseHelper loggedInUser].objectId;
                NSLog(@"%@",userID);
                NSLog(@"%@",currentuser);

                if ([userID isEqualToString:currentuser])
                {
                    NSLog(@"skipping - current user");
                    continue;
                }

                NSString *userTitle = [object valueForKey:@"userTitle"];

                NSMutableDictionary * dict = [NSMutableDictionary dictionary];
                [dict setObject:userID forKey:@"userID"];
                [dict setObject:userTitle forKey:@"userTitle"];

                // TODO: if reverse-geocoder is added, userLocation can be converted to
                // meaningful placemark info and user's address can be shown in table view.
                // [dict setObject:userTitle forKey:@"userLocation"];
                [m_userArray addObject:dict];
            }

            //when done, refresh the table view
            if (bRefreshUI)
            {
                [m_userTableView reloadData];
            }
        }
        else
        {
            NSLog(@"%@",[error description]);
        }
    }];
}

The result of the fireNearUsersQuery call will look something like this, where 3 nearby users (within a 200-mile radius) are available for chat:

iPhone video chat
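Parse performs the radius filtering on its servers, so the app never computes distances itself. Purely to illustrate what a "within N miles" filter means, here is a small Foundation-only sketch that applies the haversine great-circle formula to a hand-made list of users. It is not Parse's implementation; the function name MilesBetween, the sample users and their coordinates are all made up for the example.

```objc
#import <Foundation/Foundation.h>
#include <math.h>

// Great-circle distance in miles between two latitude/longitude pairs (degrees),
// using the haversine formula and a mean Earth radius of about 3958.8 miles.
static double MilesBetween(double lat1, double lon1, double lat2, double lon2)
{
    const double earthRadiusMiles = 3958.8;
    const double toRadians = M_PI / 180.0;

    double dLat = (lat2 - lat1) * toRadians;
    double dLon = (lon2 - lon1) * toRadians;
    double a = sin(dLat / 2.0) * sin(dLat / 2.0)
             + cos(lat1 * toRadians) * cos(lat2 * toRadians)
             * sin(dLon / 2.0) * sin(dLon / 2.0);

    return 2.0 * earthRadiusMiles * asin(sqrt(a));
}

int main(void)
{
    @autoreleasepool
    {
        // Hypothetical sample data: the searching user is in San Francisco.
        double myLat = 37.7749, myLon = -122.4194;
        double rangeInMiles = 200.0;

        NSArray *users = @[
            @{@"userTitle": @"Monkey",  @"lat": @37.3382, @"lon": @-121.8863}, // San Jose
            @{@"userTitle": @"Mammoth", @"lat": @34.0522, @"lon": @-118.2437}  // Los Angeles
        ];

        for (NSDictionary *user in users)
        {
            double miles = MilesBetween(myLat, myLon,
                                        [user[@"lat"] doubleValue],
                                        [user[@"lon"] doubleValue]);
            // Only users inside the radius would end up in the table view.
            NSLog(@"%@: %.0f miles away, %@", user[@"userTitle"], miles,
                  miles <= rangeInMiles ? @"listed" : @"skipped");
        }
    }
    return 0;
}
```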

Inside LSViewController, m_userTableView gets populated from m_userArray. Each row in the table view has a green Call button. When you tap that button, a call is initiated with that row's user as the receiver. What call? The code we just covered to store the session: saveSessionToParse. Who calls it? Well, now it's time for the video chat scene (LSStreamingViewController) to take charge.

Before proceeding, take a look at this activity flow - the big picture. You will come back to it quite often as you read on:

iPhone Video chat

Upon tapping the green phone call button, a segue is performed to transition to LSStreamingViewController. Inside LSStreamingViewController, [ParseHelper saveSessionToParse:] is called. Here is the part that performs the transition:

//LSViewController.m
- (void) startVideoChat:(id) sender
{
    UIButton * button = (UIButton *)sender;

    if (button.tag < 0 || button.tag >= (NSInteger)m_userArray.count) //out of bounds
    {
        [ParseHelper showAlert:@"User is no longer online."];
        return;
    }

    NSMutableDictionary * dict = [m_userArray objectAtIndex:button.tag];
    NSString * receiverID = [dict objectForKey:@"userID"];
    m_receiverID = [receiverID copy];
    [self goToStreamingVC];
}

- (void) goToStreamingVC
{
    //[self presentModalViewController:streamingVC animated:YES];
    //
    [self performSegueWithIdentifier:@"StreamingSegue" sender:self];
}

-(void) prepareForSegue:(UIStoryboardSegue *)segue sender:(id)sender
{
    if ([segue.identifier isEqualToString:@"StreamingSegue"])
    {     
        UINavigationController * navcontroller =  (UINavigationController *) segue.destinationViewController;        
        LSStreamingViewController * streamingVC =  (LSStreamingViewController *)navcontroller.topViewController;        
        streamingVC.callReceiverID = [m_receiverID copy];    
        if (bAudioOnly)
        {
            streamingVC.bAudio = YES;
            streamingVC.bVideo = NO;
        }
        else
        {
            streamingVC.bAudio = YES;
            streamingVC.bVideo = YES;
        }
    }
}

Once inside LSStreamingViewController:

//LSStreamingViewController.m
- (void) viewDidAppear:(BOOL)animated
{
    //a nil or empty receiver ID means an incoming call; checking length covers both
    if (self.callReceiverID.length > 0)
    {
        m_mode = streamingModeOutgoing; //generate session
        [self initOutGoingCall];
        //connect, publish/subscriber -> will be taken care by
        //sessionSaved observer handler.
    }
    else
    {
        m_mode = streamingModeIncoming; //connect, publish, subscribe
        m_connectionAttempts = 1;
        [self connectWithPublisherToken];
    }
}

- (void) initOutGoingCall
{
    NSMutableDictionary * inputDict = [NSMutableDictionary dictionary];
    [inputDict setObject:[ParseHelper loggedInUser].objectId forKey:@"callerID"];
    [inputDict setObject:appDelegate.userTitle forKey:@"callerTitle"];
    [inputDict setObject:self.callReceiverID forKey:@"receiverID"];
    [inputDict setObject:[NSNumber numberWithBool:self.bAudio] forKey:@"isAudio"];
    [inputDict setObject:[NSNumber numberWithBool:self.bVideo] forKey:@"isVideo"];
    m_connectionAttempts = 1;
    [ParseHelper saveSessionToParse:inputDict];
}

As a matter of its duty, LSStreamingViewController handles both outgoing and incoming calls. To differentiate the two, it uses the receiver ID (self.callReceiverID): for outgoing calls, it has a value supplied by LSViewController (see the segue transition code). For incoming calls, there is no need for it, so it is nil or empty.
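Since the receiver ID may be either nil or an empty string for incoming calls, the test that tells the two modes apart has to cover both cases. The Foundation-only sketch below shows one way to do that and why comparing against @"" alone is not enough. The function name IsOutgoingCall is a hypothetical helper introduced for this example, not part of the project.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: YES only when a real receiver ID is present.
static BOOL IsOutgoingCall(NSString *receiverID)
{
    // Messaging nil returns 0, so nil and @"" both yield a length of 0.
    return receiverID.length > 0;
}

int main(void)
{
    @autoreleasepool
    {
        NSString *missingID = nil;

        NSLog(@"nil ID    -> outgoing? %d", IsOutgoingCall(missingID));
        NSLog(@"empty ID  -> outgoing? %d", IsOutgoingCall(@""));
        NSLog(@"real ID   -> outgoing? %d", IsOutgoingCall(@"abc123"));

        // Pitfall: [nil isEqualToString:@""] evaluates to NO, so a
        // "not equal to the empty string" test treats nil as if an ID were present.
        BOOL naiveCheck = ![missingID isEqualToString:@""];
        NSLog(@"naive check on nil -> outgoing? %d", naiveCheck);
    }
    return 0;
}
```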

As soon as saveSessionToParse saves the ActiveSessions object to the Parse.com database, it notifies LSStreamingViewController so that the sessionID, publisherToken and subscriberToken values from Opentok (now available through the app delegate) can be used by LSStreamingViewController. This notification (kSessionSavedNotification) is handled by sessionSaved like this:

//LSStreamingViewController.m
- (void) sessionSaved
{
    [self connectWithSubscriberToken];
}

In the forthcoming section we will see how the above call sets up a seamless video chat between two users, without further Parse intervention.

Huh... the mammoth has been laid to rest, but there is still life in it. We have already covered the session generation part. But how does the other user know about it? And when exactly does Opentok take charge and start the exciting video?

Step 2 - Handle Incoming Call:

Handling an incoming call is the tricky bit. Let's list the bare necessities:

  • You need to poll the database for a session destined for you (the logged-in user) - that is - search for an ActiveSessions record where the current user is listed as receiver.
  • You need to keep the database up to date once the call has been established - that is, remove the session row once the sessionID and tokens have been read into the iPhone app.
  • You also need to signal interruptions while a session is on - that is, inform the caller gracefully that the receiver is busy on another call. For simplicity's sake, we aren't handling multi-user (conference) calls right now, although they can be handled quite easily.

Recall that in alertView:(UIAlertView *)alertView clickedButtonAtIndex, we saw a call to [appDelegate fireListeningTimer], and now it is time to expand on it, because it accomplishes the first of the three tasks listed above: it fires a timer that continually polls the Parse.com ActiveSessions table for calls destined for the current user.

//LSAppDelegate.m
//this method will be called once logged in. It will poll parse ActiveSessions object
//for incoming calls.
-(void) fireListeningTimer
{
    if (self.appTimer && [self.appTimer isValid])
        return;

    self.appTimer = [NSTimer scheduledTimerWithTimeInterval:8.0
                                                     target:self
                                                   selector:@selector(onTick:)
                                                   userInfo:nil
                                                    repeats:YES];
    [ParseHelper setPollingTimer:YES];  
    NSLog(@"fired timer");
}

-(void)onTick:(NSTimer *)timer
{
    NSLog(@"OnTick");
    [ParseHelper pollParseForActiveSessions];  
}

As its name suggests, [ParseHelper pollParseForActiveSessions] polls the ActiveSessions table for sessions calling out to this user - that is, rows whose receiverID equals the currently logged-in user's object ID.

//ParseHelper.m
//poll parse ActiveSessions object for incoming calls.
 +(void) pollParseForActiveSessions
 {
     __block PFObject *activeSession;

     if (!bPollingTimerOn)
         return;

     PFQuery *query = [PFQuery queryWithClassName:@"ActiveSessions"];

     NSString* currentUserID = [self loggedInUser].objectId;
     [query whereKey:@"receiverID" equalTo:currentUserID];  

     [query findObjectsInBackgroundWithBlock:^(NSArray *objects, NSError *error)
      {
          if (!error)
          {
              // no rows means no incoming call; otherwise take the
              // first matching session and hand it to the app delegate.
              LSAppDelegate * appDelegate = [[UIApplication sharedApplication] delegate];

              if (objects.count == 0)
              {

              }
              else
              {
                  activeSession = (PFObject *)[objects objectAtIndex:0];                 
                  appDelegate.sessionID = activeSession[@"sessionID"];
                  appDelegate.subscriberToken = activeSession[@"subscriberToken"];
                  appDelegate.publisherToken = activeSession[@"publisherToken"];
                  appDelegate.callerTitle = activeSession[@"callerTitle"];
                 // future use:
                  //appDelegate.bAudioCallOnly = !([activeSession[@"isVideo"] boolValue]);

                  //done with backend object, remove it.
                  [self setPollingTimer:NO];
                  [self deleteActiveSession];

                  NSString *msg = [NSString stringWithFormat:@"Incoming Call from %@, Accept?", appDelegate.callerTitle];                  
                  UIAlertView *incomingCallAlert = [[UIAlertView alloc] initWithTitle:@"LiveSessions" message:msg delegate:self cancelButtonTitle:@"No" otherButtonTitles:@"Yes", nil];                 
                  incomingCallAlert.tag = kUIAlertViewTagIncomingCall;
                  [incomingCallAlert show];                 
              }
          }
          else
          {
              NSString * msg = [NSString stringWithFormat:@"Failed to retrieve active session for incoming call. Please try again. %@", [error localizedDescription]];
              [self showAlert:msg];
          }
     }];
}

The method is fairly self-explanatory: whenever it finds an ActiveSessions object, it copies the fields it needs (sessionID, publisherToken, subscriberToken and callerTitle) into the app delegate's properties. Once done, it deletes the object from the Parse.com backend with [self deleteActiveSession]. The [self setPollingTimer:NO] call keeps things in sync: it ensures the timer doesn't fire another polling query through pollParseForActiveSessions after an object has been found and while its deletion is in progress.
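The flag logic can be looked at in isolation. The sketch below is a simplified, synchronous model of it. LSPollGate and its members are hypothetical names standing in for the bPollingTimerOn flag inside ParseHelper, and a counter stands in for the Parse query.

```objc
#import <Foundation/Foundation.h>

// Hypothetical stand-in for the flag that ParseHelper keeps in bPollingTimerOn.
@interface LSPollGate : NSObject
@property (nonatomic, assign) BOOL pollingEnabled;
@property (nonatomic, assign) NSUInteger queriesIssued;
- (void)tickWithPendingCall:(BOOL)hasPendingCall;
@end

@implementation LSPollGate
- (void)tickWithPendingCall:(BOOL)hasPendingCall
{
    if (!self.pollingEnabled)
        return;                     // a call is being handled; skip this tick

    self.queriesIssued += 1;        // stands in for the Parse query
    if (hasPendingCall)
        self.pollingEnabled = NO;   // stop polling until the user answers
}
@end

int main(void)
{
    @autoreleasepool {
        LSPollGate *gate = [[LSPollGate alloc] init];
        gate.pollingEnabled = YES;

        [gate tickWithPendingCall:NO];   // query issued, nothing found
        [gate tickWithPendingCall:YES];  // query issued, call found, gate closes
        [gate tickWithPendingCall:YES];  // ignored: gate is closed
        NSLog(@"queries issued: %lu", (unsigned long)gate.queriesIssued);

        gate.pollingEnabled = YES;       // user declined, resume polling
        [gate tickWithPendingCall:NO];
        NSLog(@"queries issued: %lu", (unsigned long)gate.queriesIssued);
    }
    return 0;
}
```

In the real method the query is asynchronous, so the flag is only cleared once the result block runs. The sketch ignores that gap to keep the idea visible.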

Once the ActiveSessions values are copied to the app delegate, the more important step follows: the user needs to be notified of the incoming call. incomingCallAlert performs this task, and here is the result:

 iPhone video chat

What's more important is incomingCallAlert's delegate, which we already visited in Step 1. Let's go over it again:

//ParseHelper.m
+ (void)alertView:(UIAlertView *)alertView clickedButtonAtIndex:(NSInteger)buttonIndex
{
    if (kUIAlertViewTagUserName == alertView.tag)
    {
        //let's defer saving the title till we have the location.
        //saveuserwithlocationtoparse will handle it.
        LSAppDelegate * appDelegate = [[UIApplication sharedApplication] delegate];
        appDelegate.userTitle = [[alertView textFieldAtIndex:0].text copy];
        appDelegate.bFullyLoggedIn = YES;

        //fire appdelegate timer
        [appDelegate fireListeningTimer];
        [[NSNotificationCenter defaultCenter] postNotification:[NSNotification notificationWithName:kLoggedInNotification object:nil]];
    }
    else if (kUIAlertViewTagIncomingCall == alertView.tag)
    {
        if (buttonIndex != [alertView cancelButtonIndex])   //accept the call
        {
            //accept the call
            [[NSNotificationCenter defaultCenter] postNotification:[NSNotification notificationWithName:kIncomingCallNotification object:nil]];
        }
        else
        {
            //user did not accept call, restart timer 
            //start polling for new call.
            [self setPollingTimer:YES];
        }
    }
}

If the user declines the call, the polling flag is set again and the app resumes looking for a new incoming call session. If the user accepts the call instead, kIncomingCallNotification is posted, which notifies LSViewController that a call has arrived. Ideally, any view controller within your chat app should be able to receive this notification, so that you can handle the call irrespective of where you are in the app. One way to do that is a common UIViewController subclass from which all your view controllers inherit. For simplicity, LiveSessions has only one view controller that handles incoming calls, and here is how:
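The shared-base-class idea can be sketched as follows, written against Foundation only so that it runs outside an iOS app. LSCallAwareBase, LSSettingsScreen and the notification name are hypothetical. In the app the base class would be a UIViewController subclass, and startListening / stopListening would be called from viewWillAppear: and viewWillDisappear:.

```objc
#import <Foundation/Foundation.h>

// Assumed name for illustration; the project defines its own kIncomingCallNotification.
static NSString * const kSketchIncomingCallNotification = @"SketchIncomingCall";

// Hypothetical base class that owns the observer registration.
@interface LSCallAwareBase : NSObject
@property (nonatomic, assign) NSUInteger callsSeen;
- (void)startListening;
- (void)stopListening;
- (void)didCallArrive:(NSNotification *)note;
@end

@implementation LSCallAwareBase
- (void)startListening
{
    [[NSNotificationCenter defaultCenter] addObserver:self
                                             selector:@selector(didCallArrive:)
                                                 name:kSketchIncomingCallNotification
                                               object:nil];
}

- (void)stopListening
{
    [[NSNotificationCenter defaultCenter] removeObserver:self
                                                    name:kSketchIncomingCallNotification
                                                  object:nil];
}

- (void)didCallArrive:(NSNotification *)note
{
    self.callsSeen += 1;   // a real screen would move to the streaming view here
}
@end

// Any screen inherits the behaviour without extra code.
@interface LSSettingsScreen : LSCallAwareBase
@end
@implementation LSSettingsScreen
@end

int main(void)
{
    @autoreleasepool {
        LSSettingsScreen *screen = [[LSSettingsScreen alloc] init];
        NSNotificationCenter *center = [NSNotificationCenter defaultCenter];

        [screen startListening];
        [center postNotificationName:kSketchIncomingCallNotification object:nil];

        [screen stopListening];   // screen went away; later calls are not its job
        [center postNotificationName:kSketchIncomingCallNotification object:nil];

        NSLog(@"calls seen: %lu", (unsigned long)screen.callsSeen);
    }
    return 0;
}
```

Registering only while a screen is visible means exactly one screen reacts to each call, which avoids several controllers trying to present the streaming view at once.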

//if and when a call arrives
- (void) didCallArrive
{    
     //pass blank because call has arrived, no need for receiverID.
     m_receiverID = @"";
     [self goToStreamingVC];
}

didCallArrive fires in response to kIncomingCallNotification, and all it does is empty m_receiverID to indicate that the call is destined for this user, that is, an incoming call. This, as we already saw in prepareForSegue, is enough to signal LSStreamingViewController that the call should be handled as incoming, so no new ActiveSessions object needs to be stored. All that is left is to use the sessionID and token values to connect to the Tokbox streaming server.

So far we have discussed both cases, and in both we obtained sessionID, publisherToken and subscriberToken in our app delegate before passing control over to LSStreamingViewController. For an outgoing call, [LSStreamingViewController sessionSaved] calls [LSStreamingViewController connectWithSubscriberToken]. For an incoming call, as we already saw in [LSStreamingViewController viewDidAppear], a call is made to [LSStreamingViewController connectWithPublisherToken].

//LSStreamingViewController.m
- (void) connectWithPublisherToken
{
    NSLog(@"connectWithPublisherToken");
    [self doConnect:appDelegate.publisherToken :appDelegate.sessionID];
}

- (void) connectWithSubscriberToken
{
    NSLog(@"connectWithSubscriberToken");    
    [self doConnect:appDelegate.subscriberToken :appDelegate.sessionID];
}

- (void)doConnect : (NSString *) token :(NSString *) sessionID
{
    _session = [[OTSession alloc] initWithSessionId:sessionID
                                           delegate:self];
    [_session addObserver:self forKeyPath:@"connectionCount"
                  options:NSKeyValueObservingOptionNew
                  context:nil];
    [_session connectWithApiKey:kApiKey token:token];
}

The only difference between the two is the token they use, and Opentok isn't quite clear about what changes if you use one token instead of the other (publisher token or subscriber token; as you may remember, both are generated from cloud code in the beforeSave trigger). In this project, whichever one you use to connect to a session, it allows you to publish your stream (your camera feed) as well as subscribe to the other user's stream.

Once the [_session connectWithApiKey:token:] call is made, Opentok takes over. All you need to remember is that your streaming view controller (LSStreamingViewController) must implement these protocols: OTSessionDelegate, OTSubscriberDelegate and OTPublisherDelegate. Look at the activity flow diagram above again: there, actions initiated by the iOS app are shown in yellow, and delegates are marked in green. These delegate methods belong to the three protocols that LSStreamingViewController must implement. As Opentok calls them along the flow, you take the actions needed to make that enticing video available to your user.

Now it no longer matters whether you are a caller or a receiver, as long as you implement the necessary delegate methods from Opentok. The Broadcast tutorial from Opentok has all the implementation details, and I have followed it step by step, apart from my own UI modifications. For example, if you choose to view your own stream as soon as the session connects, the following code accomplishes it:

//LSStreamingViewController.m
- (void)sessionDidConnect:(OTSession*)session
{ 
    NSLog(@"sessionDidConnect: %@", session.sessionId);
    NSLog(@"- connectionId: %@", session.connection.connectionId);
    NSLog(@"- creationTime: %@", session.connection.creationTime);
    [self.disconnectButton setHidden:NO];
    [self.view bringSubviewToFront:self.disconnectButton];

    self._statusLabel.text = @"Connected, waiting for stream...";  
    [self.view bringSubviewToFront:self._statusLabel];

    [self doPublish];
}

- (void)doPublish
{
    _publisher = [[OTPublisher alloc] initWithDelegate:self name:UIDevice.currentDevice.name];
    _publisher.publishAudio = self.bAudio;
    _publisher.publishVideo = self.bVideo;
    [_session publish:_publisher];

    //symmetry is beauty.
    float x = 5.0;
    float y = 5.0;
    float publisherWidth = 120.0;
    float publisherHeight = 120.0;

    [_publisher.view setFrame:CGRectMake(x,y,publisherWidth,publisherHeight)];
    [self.view addSubview:_publisher.view];
    [self.view bringSubviewToFront:self.disconnectButton];
    [self.view bringSubviewToFront:self._statusLabel];

    NSLog(@"%f-%f-%f-%f", _publisher.view.frame.origin.x, _publisher.view.frame.origin.y, _publisher.view.frame.size.width, _publisher.view.frame.size.height);

    _publisher.view.layer.cornerRadius = 10.0;
    _publisher.view.layer.masksToBounds = YES;
    _publisher.view.layer.borderWidth = 5.0;
    _publisher.view.layer.borderColor = [UIColor yellowColor].CGColor;
}

In the code above, the [_session publish:] call prompts the user to allow access to his / her camera, and as soon as he / she allows it, LiveSessions starts publishing the camera feed to the Opentok streaming server. A crucial piece to remember here is the following call:

_publisher.view setFrame

The SDK is designed such that without this, you never get to see your own feed. And no, any indirect method of setting the frame (e.g. addSubview: to a container view) doesn't work. At the same time, you can decorate your publisher view. For example, we have changed its border color and corner radius.

Seeing the other user's feed is something that doesn't happen at a fixed point in the flow. All you need to do is implement the necessary delegate methods so that, as soon as you start receiving that feed, you get an opportunity to configure it fully, like this:

//LSStreamingViewController.m
- (void)subscriberDidConnectToStream:(OTSubscriber*)subscriber
{
    NSLog(@"subscriberDidConnectToStream (%@)", subscriber.stream.connection.connectionId);

    float subscriberWidth = [[UIScreen mainScreen] bounds].size.width;
    float subscriberHeight = [[UIScreen mainScreen] bounds].size.height - self.navigationController.navigationBar.frame.size.height;

    NSLog(@"screenheight %f", [[UIScreen mainScreen] bounds].size.height);
    NSLog(@"navheight %f", self.navigationController.navigationBar.frame.size.height);

    //fill up entire screen except navbar.
    [subscriber.view setFrame:CGRectMake(0, 0, subscriberWidth, subscriberHeight)];

    [self.view addSubview:subscriber.view];
    self.disconnectButton.hidden = NO;

    if (_publisher)
    {
        [self.view bringSubviewToFront:_publisher.view];
        [self.view bringSubviewToFront:self.disconnectButton];
        [self.view bringSubviewToFront:self._statusLabel];
    }
    subscriber.view.layer.cornerRadius = 10.0;
    subscriber.view.layer.masksToBounds = YES;
    subscriber.view.layer.borderWidth = 5.0;
    subscriber.view.layer.borderColor = [UIColor lightGrayColor].CGColor;

    self._statusLabel.text = @"Connected and streaming...";
    [self.view bringSubviewToFront:self._statusLabel];
}

The subscriberDidConnectToStream delegate method lets you configure your own subscriber view. Again, the setFrame statement is crucial: if you omit it or pass wrong values, you may never get to see the other user's feed, something that can break your (and your friends') heart! You can also make your own UI modifications inside the same delegate method, such as reporting the current status (using _statusLabel) and decorating the subscriber view.

The Opentok SDK provides plenty of other delegate methods that you can use to add features and smarten up your app.

For example, see how I chose to handle the subscriber:didFailWithError: delegate method:

//LSStreamingViewController.m
- (void)subscriber:(OTSubscriber *)subscriber didFailWithError:(OTError *)error
{
    NSLog(@"subscriber: %@ didFailWithError: ", subscriber.stream.streamId);
    NSLog(@"- code: %ld", (long)error.code);
    NSLog(@"- description: %@", error.localizedDescription);
    self._statusLabel.text = @"Error receiving video feed, disconnecting...";
    [self.view bringSubviewToFront:self._statusLabel];
    [self performSelector:@selector(doneStreaming:) withObject:nil afterDelay:5.0];
}

- (IBAction)doneStreaming:(id)sender
{
    [self disConnectAndGoBack];
}

- (void) disConnectAndGoBack
{
    [self doUnpublish];
    [self doDisconnect];
    self.disconnectButton.hidden = YES;
    [ParseHelper deleteActiveSession];

    //set the polling on.
    [ParseHelper setPollingTimer:YES];
    //dismissModalViewControllerAnimated: is deprecated since iOS 6.
    [self dismissViewControllerAnimated:YES completion:nil];
}

There is much more inside LSStreamingViewController that needs little or no explanation for someone who knows UIKit well. So we proudly declare that the mammoth may have stopped breathing - you can see it yourself:

iPhone video chat

By the way, who is that insane soul screaming in the publisher view? LiveSessions isn't that smart - what it shows there is what your iPhone's camera sees!

And yeah, if you didn't notice, the sleeping beast is an elephant, not a mammoth. Mammoths existed only in the ice age. So is this real?

Who knows, it's just virtual. But so are nerds trolling in chat rooms.

All we need now is to discard the remnants to smooth the flow of our iPhone video chat - let's do it in one go.

The Cleanup:

As a VoIP app, LiveSessions must either keep running in the background or do its cleanup as soon as it enters the background. To keep things simple, I chose the latter. Apple lays down some rules for performing any task in the background, trivial or not. Within LSAppDelegate.m, we clean up our back end following those rules.

- (void)applicationDidEnterBackground:(UIApplication *)application
{    
    backgroundTask = [application beginBackgroundTaskWithExpirationHandler:^{

        // Clean up any unfinished task business by marking where you        
        // stopped or ending the task outright.        
        [application endBackgroundTask:backgroundTask];        
        backgroundTask = UIBackgroundTaskInvalid;        
    }];

    // Start the long-running task and return immediately.    
    dispatch_async(dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_DEFAULT, 0),
    ^{
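        // Do the work associated with the task, preferably in chunks.
        // Note: both helpers only start an asynchronous Parse request and
        // return immediately, so the task may end before the deletes finish.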
        [ParseHelper deleteActiveSession];
        [ParseHelper deleteActiveUser];
        [application endBackgroundTask:backgroundTask];        
        backgroundTask = UIBackgroundTaskInvalid;        
    });    
}
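One thing to watch in the listing above: deleteActiveSession and deleteActiveUser return as soon as their Parse requests have been started, so endBackgroundTask: can run before the rows are actually gone. A possible way to wait for both is a dispatch group. This is a minimal sketch under the assumption that the helpers are given completion blocks. LSFakeDelete and LSCleanupThen are hypothetical names, and the deletes are simulated so the sketch runs without Parse or UIKit.

```objc
#import <Foundation/Foundation.h>

// Hypothetical completion-taking stand-in for the delete helpers; the real
// ParseHelper methods shown in this tutorial take no completion block.
static void LSFakeDelete(NSString *label, dispatch_block_t completion)
{
    dispatch_async(dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_DEFAULT, 0), ^{
        NSLog(@"deleted %@", label);   // stands in for the network round trip
        completion();
    });
}

// Starts both deletes and calls allDone once, after both have completed.
static void LSCleanupThen(dispatch_block_t allDone)
{
    dispatch_group_t group = dispatch_group_create();

    dispatch_group_enter(group);
    LSFakeDelete(@"ActiveSessions row", ^{ dispatch_group_leave(group); });

    dispatch_group_enter(group);
    LSFakeDelete(@"ActiveUsers row", ^{ dispatch_group_leave(group); });

    dispatch_group_notify(group,
                          dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_DEFAULT, 0),
                          allDone);
}

int main(void)
{
    @autoreleasepool {
        dispatch_semaphore_t finished = dispatch_semaphore_create(0);

        LSCleanupThen(^{
            // In the app, this is where endBackgroundTask: would be called.
            NSLog(@"cleanup finished");
            dispatch_semaphore_signal(finished);
        });

        // Only the command-line sketch needs to block; the app would just return.
        dispatch_semaphore_wait(finished, DISPATCH_TIME_FOREVER);
    }
    return 0;
}
```

The expiration handler passed to beginBackgroundTaskWithExpirationHandler: still acts as the safety net if the deletes take longer than the system allows.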

And we also wake up following those rules:

- (void)applicationWillEnterForeground:(UIApplication *)application
{
    // Called as part of the transition from the background to the inactive state; here you can undo many of the changes made on entering the background.
    self.bFullyLoggedIn = NO;
    [ParseHelper initData];
    [ParseHelper anonymousLogin];    
}

And here goes what we call in the above code:

+ (void) deleteActiveSession
{
    NSLog(@"deleteActiveSession");
    LSAppDelegate * appDelegate = [[UIApplication sharedApplication] delegate];
    NSString * activeSessionID = appDelegate.sessionID;

    if (!activeSessionID || [activeSessionID isEqualToString:@""])
        return;

    PFQuery *query = [PFQuery queryWithClassName:@"ActiveSessions"];
    [query whereKey:@"sessionID" equalTo:appDelegate.sessionID];

    [query getFirstObjectInBackgroundWithBlock:^(PFObject *object, NSError *error)
    {
        if (!object)
        {
            NSLog(@"No session exists.");     
        }
        else
        {
            // The find succeeded.
            NSLog(@"Successfully retrieved the object.");
            [object deleteInBackgroundWithBlock:^(BOOL succeeded, NSError *error)
            {
                if (succeeded && !error)
                {
                    NSLog(@"Session deleted from parse");                   
                }
                else
                {
                    //[self showAlert:[error description]];
                    NSLog(@"%@", [error description]);
                }
            }];
        }
    }];
}

+ (void) deleteActiveUser
{
    NSString * activeUserobjID = [self activeUserObjectID];
    if (!activeUserobjID || [activeUserobjID isEqualToString:@""])
        return;

    PFQuery *query = [PFQuery queryWithClassName:@"ActiveUsers"];
    [query whereKey:@"userID" equalTo:activeUserobjID];

    [query getFirstObjectInBackgroundWithBlock:^(PFObject *object, NSError *error)
    {
        if (!object)
        {
            NSLog(@"No such users exists.");
        }
        else
        {
            // The find succeeded.
            NSLog(@"Successfully retrieved the ActiveUser.");
            [object deleteInBackgroundWithBlock:^(BOOL succeeded, NSError *error)
             {
                 if (succeeded && !error)
                 {
                     NSLog(@"User deleted from parse");
                     activeUserObjectID = nil;
                 }
                 else
                 {
                     //[self showAlert:[error description]];
                      NSLog(@"%@", [error description]);
                 }
             }];
        }
    }];
}

+(void) initData
{
    if (!objectsUnderDeletionQueue)
        objectsUnderDeletionQueue = [NSMutableArray array];
}

+ (bool) isUnderDeletion : (id) argObjectID
{
    return [objectsUnderDeletionQueue containsObject:argObjectID];
}

Both delete functions do what is expected: they delete the ActiveUsers and ActiveSessions objects from the Parse.com database. The objectsUnderDeletionQueue array is our way of keeping things in sync: while Parse.com is busy deleting an object in the background, it prevents our app from repeatedly firing delete commands for it.
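The two delete listings above do not show the queue being consulted, so here is a minimal sketch of how such a guard might be wired in. LSBeginDelete and LSEndDelete are hypothetical helpers, only the objectsUnderDeletionQueue name comes from the project, and the sketch assumes ARC and that all calls come from a single thread.

```objc
#import <Foundation/Foundation.h>

// Same role as the array set up in initData; redeclared here so the sketch
// is self-contained.
static NSMutableArray *objectsUnderDeletionQueue;

// Hypothetical helper: returns YES if a delete may start, NO if one is
// already pending for this object ID.
static BOOL LSBeginDelete(NSString *objectID)
{
    if (!objectsUnderDeletionQueue)
        objectsUnderDeletionQueue = [NSMutableArray array];

    if ([objectsUnderDeletionQueue containsObject:objectID])
        return NO;

    [objectsUnderDeletionQueue addObject:objectID];
    return YES;
}

// Hypothetical helper: call from the delete's completion block, whether the
// delete succeeded or failed, so a later retry is possible.
static void LSEndDelete(NSString *objectID)
{
    [objectsUnderDeletionQueue removeObject:objectID];
}

int main(void)
{
    @autoreleasepool {
        NSLog(@"first request starts: %d", LSBeginDelete(@"abc123"));
        NSLog(@"duplicate request starts: %d", LSBeginDelete(@"abc123"));

        LSEndDelete(@"abc123");   // the background delete has completed
        NSLog(@"request after completion starts: %d", LSBeginDelete(@"abc123"));
    }
    return 0;
}
```

In a delete method, the guard would wrap the deleteInBackgroundWithBlock: call: return early when LSBeginDelete reports NO, and call LSEndDelete inside the completion block.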

Sign off and Giving Ins:

This tutorial and the code that comes with it can serve as the bare backbone of your next cutting-edge messenger app. You can make your own customizations to Parse user management or the UI layout and create your next killer iPhone video chat app.

Keep chatting...

3 comments


  • AK amit kumar 2 years ago
    hello how to add data in ActiveSession and UserSession
  • N Nirav 4 years ago
    Hi Amit, Sorry to be late - I responded to you on your personal email. Thanks! -Nirav
  • AJ Amit Jain 4 years ago
    Hi, great tutorial. I was wondering if you are available for a freelance project to add text based chat to an iOS app?