hacking into the indian education system

12
Debarghya Das 42 debarghyadas.com Indian Institutes of Management Technology Education Public Education Hacking (general programming) CBSE ICSE Schools Computer Science The College and University Experience India Post Topics Promote 46201 people have viewed this post. On the Stepping Stone 714 Followers In the last several weeks, in a time wedged awkwardly between the end of my college finals and my summer internship, Iʹve been exposed to some information that left me quite flabbergasted. It started about a month ago. It was a very important day in the lives of two of my juniors and close friends, Sumit Shyamsukha and Ronak Shah. It was the day their board examination marks, the ICSE and the ISC respectively, were to be released. To put this in context to our nonIndian readers, the ICSE is a 10th grade national examination and the ISC is its 12th grade counterpart. I am not aware of the exact numbers, but I believe the board who holds these examinations, ambiguously named CISCE, is the second most popular board of education in India, after the CBSE. Around 150,000 students take the ICSE and around 65,000 take the ISC from in and around India. Wrought with the monumental anxiety that usually surrounds such a crucial result, Sumit had asked me whether I might be able to break into their system before Dday and quell his curiosity. I knew it was a long shot. On one hand, I wanted to impress someone who looks up to me and on the other, I thought "break into the ICSE? This isn't Hollywood you can't just hack into everything, kiddo". Putting the pessimism past me, I decided to take a look. Technicality Alert. The portions below are for an audience interested in the technical aspect of things. I will try my level best to simplify things, but in case I fail, feel free to skip the past the demarcated [technical] tag. [Technical] Handed just a URL and a task, I was unsure of how to proceed past a mostly blank screen with two textboxes in the center. One textbox was for School Code and the other was for the student ID. I typed in Sumitʹs ID hoping for a result. Not surprisingly, this got me nowhere. The results werenʹt out yet, and probably not uploaded. This wasnʹt going to be easy. I tried again, a little while after the results were released. At least I could salvage some of my respect and maybe scrape some results, I thought. I investigated further. Viewing the source of the invalid page revealed some very poorly written and badly styled javascript. The javascript wasnʹt separated away from the Hacking into the Indian Education System Debarghya Das 4520 votes by Devvrat Singh Rathore, Tushar Gupta, Nikhil Mitra, (more) More The April Fools' XKCD Alma Mater Challenge Winter. Page 1 of 12 Hacking into the Indian Education System - On the Stepping Stone - Quora 6/6/2013 http://deedy.quora.com/Hacking-into-the-Indian-Education-System

Upload: kitanovic-nenad

Post on 25-May-2017

240 views

Category:

Documents


6 download

TRANSCRIPT

Page 1: Hacking Into the Indian Education System

Debarghya Das42 debarghyadas.com

Indian Institutes of ManagementTechnology EducationPublic EducationHacking (general programming)CBSE ICSE SchoolsComputer ScienceThe College and University ExperienceIndia

Post Topics • Promote

46201 people have viewed this post.

On the Stepping Stone 714 Followers

In the last several weeks, in a time wedged awkwardly between the end of my college finals and my summer internship, Iʹve been exposed to some information that left me quite flabbergasted. It started about a month ago. It was a very important day in the lives of two of my juniors and close friends, Sumit Shyamsukhaand Ronak Shah. It was the day their board examination marks, the ICSE and the ISC respectively, were to be released. To put this in context to our non­Indian readers, the ICSE is a 10th grade national examination and the ISC is its 12th grade counterpart. I am not aware of the exact numbers, but I believe the board who holds these examinations, ambiguously named CISCE, is the second most popular board of education in India, after the CBSE. Around 150,000 students take the ICSE and around 65,000 take the ISC from in and around India. 

Wrought with the monumental anxiety that usually surrounds such a crucial result, Sumit had asked me whether I might be able to break into their system before D­day and quell his curiosity. I knew it was a long shot. On one hand, I wanted to impress someone who looks up to me and on the other, I thought "break into the ICSE? This isn't Hollywood ­ you can't just hack into everything, kiddo". Putting the pessimism past me, I decided to take a look. 

Technicality Alert. The portions below are for an audience interested in the technical aspect of things. I will try my level best to simplify things, but in case I fail, feel free to skip the past the demarcated [technical] tag.

[Technical]

Handed just a URL and a task, I was unsure of how to proceed past a mostly blank screen with two text­boxes in the center. One textbox was for School Code and the other was for the student ID. I typed in Sumitʹs ID hoping for a result. Not surprisingly, this got me nowhere. The results werenʹt out yet, and probably not uploaded. This wasnʹt going to be easy.

I tried again, a little while after the results were released.At least I could salvage some of my respect and maybe scrape some results, I thought. I investigated further. Viewing the source of the invalid page revealed some very poorly written and badly styled javascript. The javascript wasnʹt separated away from the 

Hacking into the Indian Education SystemDebarghya Das4520 votes by Devvrat Singh Rathore, Tushar Gupta, Nikhil Mitra, (more)

More The April Fools' XKCD Alma Mater ChallengeWinter.

Page 1 of 12Hacking into the Indian Education System - On the Stepping Stone - Quora

6/6/2013http://deedy.quora.com/Hacking-into-the-Indian-Education-System

Page 2: Hacking Into the Indian Education System

HTML into its own JS file (as is usually done). Neither was it minified. It was some sloppy web­work. There were several points of interest in the code. 

The page source did not contain the HTML for a particular studentʹs results but had one layer of additional security ­ it fetched them via javascript. However, to my surprise, it didnʹt fetch them from a database or anything server­side ­ it merely fetched them from another un­indexed (aka invisible to search engines) URL. Hereʹs the code that they used:1234567891011

function changeIframe() { // Retrieve 2 parameters from the URL

var indexno2 = getQuerystring('indexno2'); var indexno4 = getQuerystring('indexno4');// Concatenate the parameter Strings

var rollno = indexno2 + indexno4;// Fetch the results from a link to an html file named with the String

// of the concatenated parameters

$("#result").html('<img src=\'http://drop.ndtv.com/ndtv/common/onemin/icse/lomakePOSTRequest("web/10th/10-1/"+rollno+".html",rollno);

}

I added the inline comments to explain the code. I was taken aback a little by this discovery. Besides a wafer thin layer of hiding the content from the HTML page, all they did was fetch it from another un­encrypted HTML page.

The function makePostRequest was pretty self explanatory 

but I had to read its implementation to be sure.12345678910111213141516

function makePOSTRequest(url, parameters) { // Constructing an AJAX request

$.ajax({

data: parameters, //Data sent in the requesterror: function(request,status,error) { //In case of error execute this

failoverPOSTRequest_2("web/10th/10-2/"+parameters+".html",parameters);

}, success: function(data,status,request) { // In case of success, display

$("#result").html(data);},

url: url,type: 'POST'

});}

Essentially, it did exactly what I thought ­ it went to another URL to fetch a students results ­ a URL to <students school code><students individual code>.html. Yes. That simple. In case 

it failed, it would call failoverPOSTRequest_2 which 

would make the request to a different server ­ the 2nd one. This chain continued on to about 4 servers.[/Technical]

Page 2 of 12Hacking into the Indian Education System - On the Stepping Stone - Quora

6/6/2013http://deedy.quora.com/Hacking-into-the-Indian-Education-System

Page 3: Hacking Into the Indian Education System

Acquiring the results of ICSE and ISC candidates looked extremely straightforward because the results page had no proper security mechanism whatsoever. Let me break down what the format of an ICSE or ISC roll number looks like:ICSE: T/[4 digit school ID]/[3 digit student ID]ISC: B/[4 digit school ID]/[3 digit student ID]The school ID and the student ID are in linear order. There were several slight intricacies ­ school IDs for the same school were different for the different examinations. Eventually, it seemed like ICSE school IDs ranged from 4001 to 5568 and ISC school IDs ranged from 9001 to 9793. Student IDs always started at 001 and continued incrementally until the last student of that school. My task was merely to write a program which found these ranges and retrieved and stored all the results on my computer.

[Technical]Technically put, I merely needed to write a script to iterate through the various school IDs, check the different servers, and start with a student ID of 1 yet have a way to detect when there were no more students for a given school. I had to retrieve the resultant html files and parse them to extract all the useful information ­ Name, Date of Birth, ID, School, Marks.

This was all pretty simple. I initially wrote a prototype of the result fetcher which worked on hardcoded inputs and fetched one school at a time. You had to provide Student ID ranges, the school ID as well as the server number you wanted to query. This method was obviously stupid, but aha! I was fetching some valuable data!

Gradually, I built on the automation, until the only thing I had to supply was a range of school IDs and it would fetch data from all students from all those schools from any matching server they found and neatly store the results in a comma separated value (csv) file. 

Because I was unaware of the school ID range at first, I simulated a simple Map­Reduce model and split the work amongst a bunch of my collegeʹs machines. With a few keyboard presses and several instances of these on several remote machines, before I knew it, I was mining data at incredible speeds![/Technical]

Several hours later, I had all the ISC and ICSE results on my very own computer, in a bunch of comma­separated value files. It was truly incredible. 26 megabytes of pure, magnificent data. An Excel file I couldnʹt scroll to the bottom of. Just for kicks, I Ctrl+Fʹd a few names I knew and what do you know? There they were. Line after line of names, subjects and numbers. It was truly 

Page 3 of 12Hacking into the Indian Education System - On the Stepping Stone - Quora

6/6/2013http://deedy.quora.com/Hacking-into-the-Indian-Education-System

Page 4: Hacking Into the Indian Education System

mesmerizing. 

Take some time out and imbibe the significance of what just happened. One person had just acquired the exam results for the whole country. Not only was this a violation of any and all forms of privacy associated with something as personal as your examination marks, but a mass divulsion of all sorts of personal information ­ names, date of birth and school. This was a privacy breach of the highest order ­ a technological blitzkrieg. When 114,000 Apple IDs were compromised (AT&T Web site exposes data of 114,000 iPad users ), it was a huge deal. This is information of around 200,000 people weʹre dealing with. This is my first motive behind writing this article ­ to demonstrate the vulnerability of information on the web in India. To demonstrate how few measures our education board takes to hide such sensitive information. Even without any such fancy programming, anybody can easily peek a glance at his friendʹs results by querying for any of the 150 or so roll numbers associated with his school ID. This utter negligence of privacy with regards to grades is something I find intolerable. Marks should belong to you and only you.

Being the nerd that I am, I was inching to get my hands on all this data and crunch some badass numbers. Ideas rushed through my mind ­ regression tests on correlation between English marks and Science, city­based statistics, a final conclusive revelation of the national "toppers" and a general look at the larger scheme of things. I delved into deeper analytics as soon as I could. THE RESULTS WERE SHOCKING.

I deal with ICSE purely, for the time being.Here are some standard results first ­ the mark distributions of the 5 most common ICSE subjects.

The English (ENG) Distribution

Highest: 97Lowest: 15Mean: 69.93

Page 4 of 12Hacking into the Indian Education System - On the Stepping Stone - Quora

6/6/2013http://deedy.quora.com/Hacking-into-the-Indian-Education-System

Page 5: Hacking Into the Indian Education System

Median: 69Total Candidates: 140681Standard Deviation: 13.8119 Comments: Huh? What kind of horrible graph is this? Where are all the numbers in between? Must be a calculation mistake, I thought. Otherwise, a bimodal distribution is quite interesting. The plunge at 80 is quite weird as well. Everyone takes English, so the high number of candidates is not surprising. Whatʹs shocking is that this graph just looks wrong.

The History, Civics and Geography (HCG) Distribution

Highest: 100Lowest: 12Mean:  68.9846Median: 72Total Candidates: 140672Standard Deviation: 17.6493Comments: Woah, things are starting to get weird now. What kind of mistake could I have made in my data retrieval that I just missed out certain values? Am I that careless? Also, whatʹs with that massive drop from 90 to 95? And those certain points where everything peaks? Strange. All in all it seems like the curve leans to the higher end, with the median > mean. The standard deviation is abnormally high as well. The high number of candidates is no surprise here either, as HCG is compulsory.

The Distribution of Computer Application (CTA) Marks

Page 5 of 12Hacking into the Indian Education System - On the Stepping Stone - Quora

6/6/2013http://deedy.quora.com/Hacking-into-the-Indian-Education-System

Page 6: Hacking Into the Indian Education System

Highest: 100Lowest: 2Mean:  80.8846Median: 83Total Candidates: 86557Standard Deviation: 13.9073Comments: Okay, itʹs not even funny anymore. Why am I getting jagged graphs? And also, when did everybody become so great with Computers? Ah, but wait! I remember ­ in CTA, 50% of your grade comes from your school teacher. Clearly, students are being granted free marks there. Assuming everybody just secured that 50, the mean in the actual CTA paper becomes a more realistic 62 as opposed to the 81 it stands at now. This graph is leaning way to the right. The jagged edges still stand out, and hey, they seem to be in similar places too.

The Distribution of Hindi (Hindi) Marks

Highest: 99Lowest: 15Mean:  80.732Median: 83Total Candidates:  97128

Standard Deviation: 11.3388Comments: I give up. I guess I wrote my whole program wrong. What is up with these graphs? Other than that, people seem exceptionally good in Hindi. This graph looks normal overall if it werenʹt for those gaps.

The Distribution of Science (SCI) Marks

Page 6 of 12Hacking into the Indian Education System - On the Stepping Stone - Quora

6/6/2013http://deedy.quora.com/Hacking-into-the-Indian-Education-System

Page 7: Hacking Into the Indian Education System

Highest: 100Lowest: 16Mean:  62.061Median: 60Total Candidates:  126254Standard Deviation: 17.4817Comments: I think I just died. Are you kidding me? Is that a distribution or a hedgehog? Or did my brother just scribble that? Weird jagged edges at equal intervals ­ this is weird. This is definitely not evidence of a statistical result.

[Update]I left out Math, stupidly, my personal favorite subject and definitely one of top 6 most popular ones in the ICSE. My apologies.

The Distribution of Mathematics (MAT) Marks

Highest: 100Lowest: 7Mean: 70.1274786076387Median: 76Total Candidates:  135913Standard Deviation: (not yet calculated)Comments: Yet again, the graph disappoints me. It seems to be somewhat bimodal, but none of these subject graphs seem to resemble anything statistical.[/Update]

Putting them all together:

Page 7 of 12Hacking into the Indian Education System - On the Stepping Stone - Quora

6/6/2013http://deedy.quora.com/Hacking-into-the-Indian-Education-System

Page 8: Hacking Into the Indian Education System

I feel like Scooby Doo found a clue. This is some shocking results ­ it seems that when you plot  all the distributions together, all the spikes seem to align and the same values are missing for all subjects. Could this be some weird conspiracy theory?

Enough is enough. I decided to take all the ICSE marks in all subjects and plot the distribution. Hopefully everything would even out. The results were shocking. This was the resultant graph:

Marks Received in any Subject vs. Quantity of Students who received itMy hypothesis was correct ­ the exact same numbers were missing from everybody's results in all subjects. 

Evidence:1. There were specific numbers, in no real pattern, that were 

missing for the distribution of the entire distribution of all subjects achieved by all students. And these missing numbers were regularly interspersed on the number line. For example, 81, 82, 84, 85, 87, 89, 91 and 93 were visibly missing. I repeat, no one in India had achieved these marks in the ICSE.

2. 32, 33 and 34 were visibly absent. This chain of 3 consecutive numbers is the longest chain of absent numbers. Coincidentally, 35 happens to be the pass mark.

3. From the 66 numbers between 35, the pass mark, and 100, the maximum possible, ONLY 33 WERE ATTAINED.

4. Let me repeat that, 50% of the possible pass marks were attainable in the ICSE. Hereʹs a complete list of unattained 

Page 8 of 12Hacking into the Indian Education System - On the Stepping Stone - Quora

6/6/2013http://deedy.quora.com/Hacking-into-the-Indian-Education-System

Page 9: Hacking Into the Indian Education System

marks ­36, 37, 39, 41, 43, 45, 47, 49, 51, 53, 55, 56, 57, 59, 61, 63, 65, 67, 68, 70, 71, 73, 75, 77, 79, 81, 82, 84, 85, 87, 89, 91, 93. Yes, thatʹs 33 numbers!

Analysis and Inferences:1. Weʹve all heard about ʺgraceʺ marks. Itʹs always been a huge 

myth and an outcome, many believe, of constant religious faith. Well, itʹs myth no more. I infer that they gave away grace marks to everyone who got a 32 to 34 to turn it into a 35 or more. Clearly, marks are tampered.

2. 33 out of the 66 possible marks were not attained. Could this mean that all of these unattained marks were simply promoted to the next mark, making it unattainable? Possibly, but weʹll never know which ones are promoted or demoted and by how much? All we know are that the marks are tampered.

3. One of the most common critiques of my theory was this ­maybe there were questions with only 3 or 4 mark intervals in all subjects making certain marks mathematically unattainable. My counterargument? All numbers from 94 to 100 are attainable and have been attained. What does this mean? It means that increments of 1 to 6 are attainable. By extension, all numbers from 0 to 100 are achievable. Let me give you an example. If 99 and 98 were definitely achievable with deductions of 1 and 2 respectively, this means one of two cases ­ there is a question A worth 1 mark that made 99 occur, and a question B worth 2 maks that made 98 occur, which meant getting A and B both wrong would mean 97 could occur. Case 2 ­ Question A was worth 1 mark, and question B was worth 1 mark too. The 99 got A wrong, and the 98 got A and B wrong. By this logic, if 97 were not possible, it would mean that there is no other question of 1 mark in the examination or that nobody got a 2 point question wrong and question A or B. Itʹs slightly complicated, and I did a horrible job of explaining it, but if you think about it and reason about this with regards to the point division in how the ICSE is structured, youʹll realize itʹs almost impossible that all of the nearly 150,000 candidates in any subject just did not get a combination of questions wrong that added up to 7. Itʹs statistically impossible.

4. Even if you believe that the above proof is enough to ascertain that definitely some amount of mark tampering occurred, you canʹt guarantee that it was harmless and a small one or two point boost. Even if it were just a one or two point boost, mark tampering is unfair. Further, it is highly unlikely to assert the simpler solution ­that it was a uniform case of mark promotion at certain marks. We do not know if they lowered or increased peoples 

Page 9 of 12Hacking into the Indian Education System - On the Stepping Stone - Quora

6/6/2013http://deedy.quora.com/Hacking-into-the-Indian-Education-System

Page 10: Hacking Into the Indian Education System

by an insignificant number. They might have boosted somebody up by 5 for all we know.Further, there is no gaps in the distribution from 94 to 100. This could mean one of two things ­ the ideal, that none of these marks are tampered for a special reason, or two, these marks are also tampered unevenly but we donʹt see the results on the graph because there is no break in the distribution.

5. You might be thinking right now how much of a joke this article is. You might be rolling your eyes thinking ʺCome on, canʹt it just happen that certain marks werenʹt achieved?ʺ If youʹre a skeptic and still donʹt believe when I say the absence of a 93 is statistically impossible, read on. Oneʹs total ICSE score is broadly gauged by one of 3 metrics ­ Overall Average, Best 5 subjects, and Best 4 subjects, plus English. Statistics says that if you take enough samples of data, regardless of the distributon, it will average out into a Normal distribution. When I plot the distribution of these metric, voila! 

What was initially a jagged mess has all of a sudden become a refined slightly askew bell. Statistics magically transformed that jagged mess into a nice curve. It is the same statistical theory that says that it is not possible for that 93 and those 32 other numbers to be absent from the previous distribution.

6. I know, I know. There will always be haters. One might still be saying, ʺMaybe this one examination was a little strange, but Iʹm sure they do it right most of the time.ʺ I present to you, the ISC result, the class 12 version of the same exam offered by the same board:

Page 10 of 12Hacking into the Indian Education System - On the Stepping Stone - Quora

6/6/2013http://deedy.quora.com/Hacking-into-the-Indian-Education-System

Page 11: Hacking Into the Indian Education System

Shocked? In the ISC, the pass mark is 40. 24 of the 61 numbers between 40 and 100 are unattained. Thatʹs a little more than 39% of them all. Here, grace marks rage on. Everything from 35 onwards, and most things from 23 onward seem blindly promoted to a pass mark.

The ISC determines college admissions to a wide array of Indian colleges that impose strict cut­offs for admission. One mark can change your fate. In such an abhorrent system, even the simplest case of making a 93 a 92 could change a future. Adulteration of these marks is absolutely unacceptable.

In my opinion, there is not a shadow of doubt in my mind that the CICSE board is fraudulent and guilty of mark tampering. Whether they changed some results by plus or minus 1 or plus or minus 5 is irrelevant. Fact is, they changed some results. This is the second motive behind my writing of this article ­ to draw out this fact.

Disclaimer: The total number of candidates I parsed maybe slightly short of the complete actual number. I may have missed some corner cases when fetching the data, and there may have been absentees during the examination. Also, for detailed analysis, I removed everybody who didn't complete the requisite number of subjects in the ICSE, which is why the number is slightly less than it should be

Prologue: For those readers who are still not convinced, I'll be updating this post on request to include more of the ISC data.

For those of you who are interested, I recently also cracked the CBSE class XII security and have managed to attain most of the results there too. This hack presented a whole host of new technical challenges. Fetching 1 million results as opposed to several hundred thousand is an order of a magnitude increase and required me to scale up my retrieve and data processing methodology. It should make for a good technical read. The results are eye­opening, to say the least.

Relevant Links:• The results page ­ ICSE, ISC Results for X and XII Board

Examinations on NDTV

Updates:1. GitHub repo is now private. Sorry guys.2. Added Math Distribution3. Twitter / Search ­ icse4. I haven't read all of these so I don't know how accurate their

Page 11 of 12Hacking into the Indian Education System - On the Stepping Stone - Quora

6/6/2013http://deedy.quora.com/Hacking-into-the-Indian-Education-System

Page 12: Hacking Into the Indian Education System

Recommended Posts From Other BlogsIs There A Complaint Department?: The Story Of My Wedding Dress by Cyndi Perlman FinkLeading a better life: Why do we need confidence? by Oliver EmbertonLeading a better life: You will still be here tomorrow, but your dreams may not by Oliver EmbertonMiss Cranky Pants: My Immutable Truths - Part 1 by Cyndi Perlman Finktheseparticles: Why I Hate Going Out by Anonymous

About Privacy Terms • Login Sign Up

content is.I spoke to the Times Of India (linked below) and I would like to clarify whatʹs been written in the article. The article states "A 20­year­old Indian student from Cornell University hacked into the database ... " This is technically incorrect. I did no such thing. I did not illegally access any database system. All I did was access information that was available to any person who entered a number into the website could access. I simply mined the data and then analyzed it to reveal some interesting and disturbing trends"

Daily Mail: ʹThese marks have been tampered withʹ: Sensational claim of US student who hacked CISCE resultsOutlook India: http://blogs.outlookindia.com/default.aspx?ddm=10&pid=2988&eid=31Daily Bhaskar: College student hacks 10th, 12th board results,reveals system loopholesHindustan Times: College boy hacks ICSE, ISC results ­Hindustan TimesThe Times of India: Indian student in Cornell University hacks into ICSE, ISC database ­ The Times of IndiaFirstpost: College grad hacks ICSE, ISC results, shows up India’s cyber flaws

46201 Views156+ Comments • Report • 8:59 on Tue Jun 4 2013

Page 12 of 12Hacking into the Indian Education System - On the Stepping Stone - Quora

6/6/2013http://deedy.quora.com/Hacking-into-the-Indian-Education-System