DBA’s location history v2

André Guimarães, 2012-08-23

Switzernet

DBA’s location history v2. 1

1.      Previous version.. 1

2.      Objectives. 1

3.      Problems found.. 2

4.      Current version.. 2

Tables, events and procedures in DB statistics. 2

Data retrieval 4

5.      Installation.. 4

6.      Deployment.. 6

7.      Future developing.. 6

8.      Resources. 6

 

 

This document describes a possible implementation of the new location history providing an easier way of retrieving seasonal migration of customers or detecting problems in servers. The additional fields added also allow detecting easier frauds using registered phones in the Astrads.

 

Previous version

 

Location history is a group of tables that describe the history of all customers’ registrations. When a customer registers in an Astrad, information is sent to a DBA (and then shared with the others) which fills the table location2_history. This table contains the date of first and last registrations as well as the total registrations made in this period for each combination of account, Astrad, IP address and port.

 

One of the limitations of the current used version is that each DBA has only information regarding the accounts registered in each of its slave Astrads. This means that it is not possible to have a global view of our network without connecting to each DBA.

 

Another limitation is that this table can grow very much and after some if a telephone reuses the same IP address and port the old information is updated giving a false impression that the phone continued to register for all that time. This means that we cannot have full confidence in the data present in this table.

 

Also, upon updating a DBA to a new version all information that is contained in the table is lost. This is due to a “DROP DATABASE” that is done when the replication is made.

 

Objectives

 

The objectives of this new version are to solve the aforementioned problems:

-          Keep always the full history of registrations during a configured time period;

-          Have a central location2_history with all the information of all customers;

-          Solve the problem of the statistics in case of reutilization of the same IP and port;

-          Provide an easy way of retrieving data concerning the registered customers;

-          Add additional fields to help track problems with customers or in the Astrads;

-          Improve the quality of information provided in case of fraud.

 

Problems found

 

In the first implementation attempts, the table structure was changed to have the new required fields and the data was being written for all servers. However due to the huge amount of information that started being generated each insert in the table was increasingly slower. In a few days the information was arriving quicker than the server could treat it and write it.

 

This forced us to have to tables one with “active” information and other with older information. By having two tables, the problem of rewriting information, in case of port reusage, was automatically solved as long as the period we consider an account to be active doesn’t permit the rotation of all ports in a router even in case of NAT problems.

 

Current version

 

The SQL code can be downloaded [here].

 

Instead of using database astrad, the new tables were created in database statistics to prevent deletion on migrations. Also even if the SQL is run multiple times the tables are not destroyed neither the data within.

 

The new database is composed of 5 tables: location2_history_active, location2_history_old, config, location2_hourly and location2_daily.

Tables, events and procedures in DB statistics

 

The table config defines how much time the data should be kept. For the moment we keep 370 days of data in the database.

 

CREATE TABLE IF NOT EXISTS `config` (

  `name` varchar(80) NOT NULL,

  `value` varchar(80) NOT NULL,

  `description` varchar(250) DEFAULT NULL,

  PRIMARY KEY (`name`)

) ENGINE=InnoDB DEFAULT CHARSET=latin1;

 

INSERT INTO `config` VALUES ('log_history_days','370','Keep log registration for number of days specified.');

INSERT INTO `config` VALUES ('log_history_hourly','8880','Keep log registration of location2_hourly for number of hours specified.');

INSERT INTO `config` VALUES ('log_history_daily','370','Keep log registration of location2_daily for number of days specified.');

INSERT INTO `config` VALUES ('log_history_backup','7200','Backup data after the specified time in seconds.');

 

The table location2_history_active is where the active data is kept. The triggers executed in database astrad write in this table and insert a new line if the keys username, db, domain, ipaddress and port are unique. On insertion the location of the phone is written and the start and stop date are set to the moment of registration. A new field expires indicates how much time was set in the phone for expiration time. The new field useragent is also written.

 

If they the aforementioned fields aren’t unique, the stop is updated to the new registration time, the expire is recalculated and the count is incremented by one.

 

The format of this table is:

 

CREATE TABLE `location2_history_active` (

  `db` varchar(15) NOT NULL,

  `domain` varchar(15) NOT NULL DEFAULT '',

  `username` varchar(80) NOT NULL DEFAULT '',

  `ipaddress` varchar(15) NOT NULL DEFAULT '',

  `port` smallint(5) unsigned NOT NULL DEFAULT '0',

  `useragent` varchar(20) Default 'Unknown',

  `start` datetime DEFAULT NULL,

  `stop` datetime DEFAULT NULL,

  `expires` int(10) unsigned,

  `count` int(10) unsigned DEFAULT '1',

  PRIMARY KEY (`username`,`db`,`domain`,`ipaddress`,`port`)

) ENGINE=InnoDB DEFAULT CHARSET=latin1;

 

All entries older than 'log_history_backup' that still exist in location2_history_active are moved to location2_history_old by the event Evt_Backup_location2_history_active. This event is run each 5 minutes.

 

location2_history_old has the same format as location2_history_active with an additional index field which is the key in this table. This way it is possible to have multiple lines with the same username, db, domain, ipaddress and port combination.

 

The lines from this table are removed after 'log_history_days' days by the event Evt_location2_history_old_Delete.

 

Each 30 minutes a list of all unique account numbers registered in that period are added to the table location2_hourly. This is done by event Evt_location2_hourly_Insert which calls procedure Proc_Insert_location2_hourly. The event Evt_location2_hourly_Delete deletes entries in this table older than 'log_history_hourly' hours.

 

The format of the table is:

 

CREATE TABLE IF NOT EXISTS `location2_hourly` (

  `username` varchar(50) NOT NULL,

  `period` datetime NOT NULL,

  PRIMARY KEY (`username`,`period`),

  INDEX USING BTREE (username),

  INDEX USING BTREE (period)

) ENGINE=InnoDB DEFAULT CHARSET=latin1;

 

A similar table is generated for the unique daily accounts. This data is kept in location2_daily and is obtained from location2_hourly. The table containing the daily statistics is filled by the event Evt_location2_daily_Insert which calls procedure Proc_Insert_location2_daily each day, filling the information of the previous day. The information is kept in this table for 'log_history_daily' days. Older information is removed by the event Evt_location2_daily_Delete.

Data retrieval

 

Some example views and procedures were created to ease the process of data retrieval.

 

missing_daily_customers_week: This view shows a list of all customers that were registered one day and not on the following day for the last week.

 

missing_total_daily_customers_week: Same as above but shows a count instead of a list of users.

 

new_daily_customers_week: This view shows a list of all customers that were not registered one day and are on the following day for the last week.

 

new_total_daily_customers_week: Same as above but shows a count instead of a list of users.

 

missing_weekly_customers_week: This view shows a list of all customers that were registered two weeks ago but not on the current week.

 

Get_missing_daily_customers(date): Procedure that shows a list of customers that were registered one day but not on the following day. It receives as argument the last day of the period we wish to compare.

 

Get_missing_weekly_customers(date): Procedure that shows a list of customers that were registered on one week but not on the following. It receives as argument the last day of the period we wish to compare.

 

Get_missing_monthly_customers(date): Procedure that shows a list of customers that were registered on one month but not on the following. It receives as argument the last day of the period we wish to compare.

Installation

 

To install this new code, the above SQL should be run. In addition in DB astrad the code marked should be added to the location2 triggers (the old code for location2 can be removed).

 

DELIMITER |

DROP TRIGGER IF EXISTS `Trg_Insert_location2`|

CREATE TRIGGER `Trg_Insert_location2` AFTER INSERT ON `location2` FOR EACH ROW BEGIN

DECLARE IPAD varchar(15) DEFAULT '';

SET IPAD=(SELECT value FROM config WHERE name='ipaddr');

IF NEW.ipaddr IS NOT NULL AND NEW.port IS NOT NULL AND NEW.expires IS NOT NULL AND NEW.dba=IPAD THEN

  INSERT INTO registration (username,dba,domain,ipaddr,port,expires,register,useragent,action) VALUES (NEW.username,NEW.dba,NEW.domain,NEW.ipaddr,NEW.port,NEW.expires,NEW.register,NEW.useragent,'REGISTER');

END IF;

CALL add_multiple_ua(NEW.username,NEW.domain,NEW.ipaddr,NEW.port,NULL,NULL,NULL,NULL,"ADD");

INSERT INTO `statistics`.location2_history_active (db,domain,username,ipaddress,port,useragent,start,stop,expires) VALUES (NEW.dba,NEW.domain,NEW.username,NEW.ipaddr,NEW.port,NEW.useragent,NEW.register,NEW.register,UNIX_TIMESTAMP(NEW.expires)-UNIX_TIMESTAMP(NEW.register)) ON DUPLICATE KEY UPDATE count=count+1, stop=NEW.register, expires=UNIX_TIMESTAMP(NEW.expires)-UNIX_TIMESTAMP(NEW.register);

END |

 

DROP TRIGGER IF EXISTS `Trg_Update_location2`|

CREATE TRIGGER `Trg_Update_location2` AFTER UPDATE ON `location2` FOR EACH ROW BEGIN

DECLARE IPAD varchar(15) DEFAULT '';

SET IPAD=(SELECT value FROM config WHERE name='ipaddr');

IF NEW.ipaddr IS NOT NULL AND NEW.port IS NOT NULL AND NEW.expires IS NOT NULL AND NEW.dba=IPAD THEN

  IF OLD.username != NEW.username OR OLD.dba != NEW.dba OR OLD.domain != NEW.domain OR OLD.ipaddr != NEW.ipaddr OR OLD.port != NEW.port THEN

    INSERT INTO registration (username,dba,domain,ipaddr,port,expires,register,useragent,action) VALUES (OLD.username,OLD.dba,OLD.domain,OLD.ipaddr,OLD.port,OLD.expires,OLD.register,OLD.useragent,'UNREGISTER');

  END IF;

  INSERT INTO registration (username,dba,domain,ipaddr,port,expires,register,useragent,action) VALUES (NEW.username,NEW.dba,NEW.domain,NEW.ipaddr,NEW.port,NEW.expires,NEW.register,NEW.useragent,'REGISTER');

END IF;

CALL add_multiple_ua(NEW.username,NEW.domain,NEW.ipaddr,NEW.port,OLD.username,OLD.domain,OLD.ipaddr,OLD.port,"UPD");

INSERT INTO `statistics`.location2_history_active (db,domain,username,ipaddress,port,useragent,start,stop,expires) VALUES (NEW.dba,NEW.domain,NEW.username,NEW.ipaddr,NEW.port,NEW.useragent,NEW.register,NEW.register,UNIX_TIMESTAMP(NEW.expires)-UNIX_TIMESTAMP(NEW.register)) ON DUPLICATE KEY UPDATE count=count+1, stop=NEW.register, expires=UNIX_TIMESTAMP(NEW.expires)-UNIX_TIMESTAMP(NEW.register);

END |

DELIMITER ;

Deployment

 

At the time of writing of this document this code is in production for 3 months in just one of the DBAs. This DBA doesn’t have usually registered customer and is used in case of problems with another DBA or migrations.

 

No problems were detected with the code for this working period. It is not getting noticeable slower even after this time.

 

The information has helped in providing additional information in some of the frauds. It also helped in detecting cases of NAT problems.

 

During this time location2_history_old grew to 470K lines, location2_daily to 388K (92 days) and location2_hourly 17.3M (4377 half hours).

 

Currently the database is configured to keep data for a full year and consider the registration data expired after 2 hours.

 

Future developing

 

It is recommended that this code is installed in one or several monitor servers. This way it will not affect the charge of the DBAs when generating reports. In the monitoring servers, it would be also possible to generate graphs with information regarding for instance the weekly gain or loss of customers.

 

New tables might be needed to optimize the current generation of reports. Instead of executing each time one of the statistics query, the results could be saved in tables and then the result could be selected.

 

We could backup to a file old data instead of deleting it. This way we could easily use it again if needed.

Resources

 

DBA versions (Astrad Master/Billing Slave server)

http://switzernet.com/3/public/110317-db3-versions/

 

Astrad/DBA schema

http://switzernet.com/3/public/111019-astrad-dba-schema/

 

Astrad versions

http://switzernet.com/3/public/110126-astrad-versions/

 

Server monitoring

http://portasip-monitor.switzernet.com/110726-monitor

 

Internal locking in MySQL

http://dev.mysql.com/doc/refman/5.0/en/internal-locking.html