Created on 2010-11-10 by Surabhi Thorayintavida
Switzernet
Create Excel file with required data
This document will help you to understand the steps that need to be followed for verifying the total call duration and the total number of calls coming from Verizon to Switzernet at our side and the calls from Verizon to Switzernet at Verizon’s side. Also, training on the creation of invoice for Verizon is included.
The purpose of this training is to understand the various steps and prepare a validation document.
Our first step is to make a local copy of the required CDR files.
In this document file for a period of one year (2009-10-01 to 2010-09-30), 12months is retrieved. Open cygwin and go to the path where you want to place the files, for retrieving the Switzernet zip files and unzipping them use the below commands.
CDR Switzernet ID=32
1. Make a new directory
mkdir switzernet
2. Change the directory
cd Switzernet
3. Command to select the months October, November and December (10, 11, and 12). It is one line command without space
scp surabhi-thorayin@www.intarnetinc.com:/usr/home/porta-admin/statistics/1/vendors/32/2009-1[0,1,2]-*-Switzernet.csv.zip .
4. Command to select the months from January till September 2010 (01…09)
scp surabhi-thorayin@www.intarnetinc.com:/usr/home/porta-admin/statistics/1/vendors/32/2010-0*-Switzernet.csv.zip .
5. Make directory
mkdir zip
6. Unzip as the downloaded file will be zip files.
unzip \*.zip
7. Move all the zip files to zip directory.
mv *.zip zip/
8. Come back to the directory where the unzipped csv files are present.
cd ../
For retrieving the Inbound_Verizon zip files and unzipping them use the below command, the explanation is same as for Switzernet.
CDR Inbound Verizon ID=6336
mkdir inbound_verizon
cd inbound_verizon
scp surabhi-thorayin@www.intarnetinc.com:/var/customers/6336/2009-1[0,1,2]-*-Inbound_Verizon.csv.zip
scp surabhi-thorayin@www.intarnetinc.com:/var/customers/6336/2010-0*-Inbound_Verizon.csv.zip
mkdir zip
unzip \*.zip
mv *.zip zip/
cd ../
The result will be two folders Switzernet and Inbound_Verizon. The Switzernet folder will contain 365 (daily generated) CSV files and Inbound_verizon (monthly generated) will contain 12 CSV files.
Most of the CSV files that we have obtained will have more than 65000 lines, hence we cannot open them completely in an excel spreadsheet. Also, it is difficult to compute manually the total call duration in 365 excel files. Therefore, convert the raw files into managable files.
1. Switzernet: To do this open cygwin and go to the path were Switzernet folder is present and merge the files in Switzernet folder into one csv file, name the csv file as out1.csv
Inbound_Verizon: Go to the path were Inbound_Verizon folder is present and merge the files in Inbound_Verizon folder into one csv file, name the output file as out2.csv
2. Always type ‘ls’ command to make sure that you are in the right folder, ls will show you the list of all the files in the folder (for example in the case of Switzernet folder it will be 365 files).
cat *.csv > out1.csv
cat *.csv > out2.csv
The result of this merge step will be an additional file out1.csv and out2.csv each in the folders Switzernet and Inbound_Verizon (out2.csv). Move this files to a new folder (not under switzernet) and name them out-switzernet and out-verizon.
The entire columns in out1.csv and out2.csv are not needed. By using awk command retrieve the required columns. Also, 0: is added to the value in 6th column (call duration) of the Switzernet file (out1.csv) so as to read the time data correctly when we open the CSV file in an excel spreadsheet.
From out1.csv, the 1st column and the 6th column with a suffix 0: is retrieved.
From out2.csv, the 2nd Column and the 8th column is retrieved.
Switzernet(out1.csv):
awk -F\",\" '{print"\""$1"\",\”0:"$6”\””}' out1.csv > select1.csv
Inbound_verizon(out2.csv):
awk -F\",\" '{print""$1"\",\""$2"\",\""$8"\""}' out2.csv > select2.csv
The result of this column selection step will be an additional file select1.csv and select2.csv each in the two folders out-switzernet and out-verizon ( select2.csv)
The select1 and select2 are two huge CSV files, which need to be split into CSV files of 60500 lines each so that it can be opened completely in an excel spreadsheet.
The split command in UNIX is used to do this, before we start the split operation for better clarity we can create a new folder split under out-switzernet and move the file select1.csv to this location ../out-switzernet/split
For Inbound_Verizon copy the file select2.csv to the location ../out-verizon/split
Once this is done we can start the split operation
1. Switzernet: Go the folder …/out-switzernet/split containing the select1.csv file, and type the below command
split -l 60500 select1.csv
2. Inbound_Verizon: Go the folder ../out-verizon/split containing the select2.csv file, and type the below command
split -l 60500 select2.csv
The result will be around 30 files under …/out-switzernet/split and 15 files under
../out-verizon/split. The name of the split files will be xaa, xab etc. By default there will be no extension for the files, rename the two sets of files with an extension .csv.
The final result of split and rename step will be two sets of .csv files.
1. Open the file in excel sheet and save file as .xls
2. Switch on the filter by going to Data ŕ Filterŕ AutoFilter as in Figure 1
Figure 1 Enable the auto filter option
3. Remove the header and footer (From, 0:Charged Time
And TOTAL 0:14008:50)
The merged file out1.csv will contain the header and footer of the independent csv files, the split files will also contain unordered header and footer, the presence of these two lines can cause error while calculating the total.
To remove, choose Custom from the filter and give the criteria as in Figure 2 Show rows where: first column equals TOTAL or equals From and click ok.
Figure 2 Select the custom option from filter to filter out the header and footer
You will get the result as in Figure 3. Remove all these rows by selecting them and deleting them Figure 4
Figure 3 Filtered From and TOTAL Figure 4 Delete the filtered rows
4. Before you start computing the total call duration create a new excel file sum.xls where you can copy the result from each of the excel files.
5. Calculate the total. To find the total duration, use the formula D1=SUM(B:B)
The sum will be like 3044:42:43, place these three values in three different columns of the new file and name the three columns as Hour (A), Minute (B) and second (C) and compute the total call duration in minutes (D) as: D1=SUM(A1*60,B1,C1/60)
6. Continue the same procedure for rest of the files in split folder with name x**
7. Find the total call duration in minutes from sum.xls file.
Go to the folder containing all the downloaded files (in switzernet).Use the below command to retrieve the total from the files to a new file named sumgrep.csv
grep TOTAL *.csv > sumgrep.csv
The time will be in the format 140008:08 in column B of sumgrep.csv
1. Find the position of : in column B using the formula C2=SEARCH(":",B2)
2. Use the formula D2=LEN(B2) to find the length of the number in column B
3. Separate the minutes and seconds to two different columns. to take the minutes do E2=LEFT(B2,C2-1), to take the seconds value do F2=RIGHT(B2,D2-C2)
4. All the numbers are in string format to change it to numeric format add zero G2=E2+0 and H2=F2+0
5. Sum the minutes and seconds J2=SUM(G:G) and J3=SUM(H:H)
6. Find the total in minutes using: K2=J2+J3/60
Compare this total with the total you have calculated manually. Both should be the same.
The merged file out2.csv will contain the header and footer of each file. The number of lines after eliminating the header and footer will be equal to the number of calls. The total number of calls can be found by
1. Find the total number of lines of the out1.csv files by opening the files in notepad ++
2. Get the total number of actual downloaded csv file.
Total no of calls = Total number of lines – 2*(Actual number of CSV files)
Eg:
For Switzernet it is almost 365 files
Therefore, Total no of calls = 2024095 -2*365 = 2023365
To eliminate the internal calls:
1. Make a second sheet in one of the split excels (eg:xaa.xls) and define name as phone as in Figure 5
Figure 5 Give a name for the selected fileds in sheet 2
2. Use the formula C1 = COUNTIF(phone, LEFT(A1,7)) to find if the from number is in the list of our numbers. The result is zero if the number is external and is one if number is 1.
3. Now select all the rows contains external numbers, to do this use the filter and select 0 from drop down menu.
4. Find the total charged time (HH:MM:SS) and put the value in three different column (D1, E1, F1) as done before.
5. Calculate the total minutes by using the formula G1=SUM(D1*60,E1,F1/60). For each file calculate the sum and paste into a new excel. To copy the formula to your complete file double click on one filed where formula is present.
6. Sum the total call duration. Also you can find the number of calls ( this can be found by summing the count 1 (internal calls) in each file and copying the result each time to the new file and then find the Total internal calls and subtract the Total internal calls from Total no of calls found earlier). Figure 6
Figure 6 Filter row where count =1 (internal calls)
1. Open the file in excel sheet and save file as .xls
2. Switch on the filter by going to Data ŕ Filterŕ AutoFilter as in Figure 1Choose Custom from the filter and give the criteria:
Show rows where: first column equals TOTAL or equals Account and click ok.
Do the same steps as for Switzernet .
3. Before you start computing the total call duration create a new excel file sum.xls where you can copy the result of each excel file.
4. Calculate the total for each split file and copy the result to the new sum.xls file.
To find the total duration Figure 7, use the formula D1=SUM(C:C)
The sum will be in seconds, you can convert to minutes by dividing the total by 60. D2=D1/60
Figure 7 One of the split files of select2.csv
5. Continue the same procedure for rest of the excel files in split folder with name x** .
Find the total call duration in minutes from sum.xls file.
Go to the folder containing all the downloaded files (in inbound_verizon). Use the below command to retrieve the total from the files to a new file named sumgrep.csv
grep TOTAL *.csv > sumgrep.csv
Here you will get the result in a file named sumgrep.csv
We will have the TOTAL in seconds, minutes and seconds and also the cost. Take the TOTAL sum of call duration in seconds and divide by 60.
Compare this total with the total you have calculated manually. Both should be the same.
The merged file out2.csv will contain the header and footer of each file. The number of lines after eliminating the header and footer will be equal to the number of calls.
The total number of calls can be found by
1. Find the total number of lines of the out2.csv files by opening the files in notepad ++
2. Get the total number of actual downloaded csv file.
Total no of calls = Total number of lines – 2*(Actual number of CSV files)
1. Compare the total manually calculated and the total found using grep. (If same)
2. Compare it with the total call duration that we found for Switzernet (after eliminating the internal calls)
3. Compare the total already calculated in the csv file and the total of all the other lines excluding the line where the Total is present. To do this, go to the folder where only the actual downloaded csv files are present and use the below command.
Switzernet:
grep -v TOTAL *.csv | awk -F\",\" '{split($6,a,":"); min+=a[1]; sec+=a[2]} END{print min+int(sec/60) ":" sec%60}'
grep TOTAL *.csv | awk -F\",\" '{split($6,a,":"); min+=a[1]; sec+=a[2]} END{print min+int(sec/60) ":" sec%60}'
Inbound Invoice:
grep TOTAL *.csv | awk -F\",\" '{ SUM += $8} END{print int(SUM/60)}'
grep -v TOTAL *.csv | awk -F\",\" '{ SUM += $8} END{print int(SUM/60)}'
4. If the Total’s are approximately the same the invoice for verizon can be prepared (there might be a small difference which can be ignored)
For verizon there will be one (original downloaded) csv file corresponding to each month.
We will have two cases
- CSV file is having less than 65000 lines
- CSV file is having greater than 65000 lines
For the first case open the file in excel spreadsheet and save as xls (eg: 2009-11-30-2009-11-01-Inbound_Verizon.xls), open a new xls file name it invoice.xls and make the below column names:
MONTH, Number of calls, ChargedTime -Seconds, ChargedTime –Min,
Charged amount(CHF)
Compute the total charged time seconds and number of calls from the file and copy the value to invoice.xls
For the second case, split the file to 2.
For example if the file name is 2010-01-31-2010-01-01-Inbound_Verizon.csv
split 60500 2010-01-31-2010-01-01-Inbound_Verizon.csv
In most of the cases one month .csv file will be not more than 121’000 lines. Hence, the result after split will be two files xaa and xab. Rename them as xaa.csv and xab.csv
Open them in xls and save as xls files. Take the required data from two files as for 1st case. Sum the result of 2 split files and copy to invoice.xls
Each time you enter the value to invoice.xls
1. Find the ChargedTiime –Min. D2=C2/60 (apply the same formula to all the rows, round off to 1 digit after decimal)
2. Find the Charged amount (CHF) E2= D2* 0.005 CHF (round off to 2 digits after decimal)
Each time compare with the already calculated total in the csv file (sumgrep.csv file for inbound_verizon data will have the already computed total values for each month)
3. Find Number of calls for each month (equal to total number of rows I the excel excluding the header and footer of the table) and place the value in column B
4. Perform the above steps for all the months,
5. Find the TOTAL Charged amount (CHF) F2=E:E
6. Find 7.6% VAT of TOTAL Charged amount (CHF) F3= =F2*0.076
7. Total including the VAT F4=sum(F3,F2)
1. Use the invoice format doc
2. Modify the billing dates, amount and also update all the columns. Add an additional column Number of calls
Month |
Number of calls |
Call Duration, Minutes |
Charged Amount, CHF |
3. Print the documents and provide a Switzernet stamp and expedition stamp before sending.
2. Burn a CD which contain the actual CDR files, the excel files and scanned copy of invoice.
3. Name the CD and take a scan of the CD.
4. Post the invoice, CD with a cover in a hard cover as registered post.
5. Get the tracking number while posting
6. Seize invoice in index.xls
-seize the scanned copy of invoice, CD scan and post slip together in one row.
-another line with the zip file that you have burned in the CD
Create a validation document by selecting a random range of months (min 2 months, max 5 months between 2009-10-01 and 2010-09-30 and compute the call duration, number of calls and charged amount.
The validation document should contain
1. The screen shots of all the UNIX command used in each step.
2. A table containing the results of Switzernet (call duration, number of calls), before and after removing the internal calls.
3. A table containing the results of Verizon (call duration, number of calls)
4. A final table similar to the invoice containing the month, call duration, number of calls and charged amount for the months that you are doing the computation.
Upload the validation document on the training session web site, according to the guidelines.
* * *