1
00:00:00,160 --> 00:00:03,270
In this demonstration you're going to see how MD5 hashes

2
00:00:03,270 --> 00:00:05,770
are created for two slightly different inputs,

3
00:00:05,770 --> 00:00:08,600
creating a very large difference in their outputs.

4
00:00:08,600 --> 00:00:09,780
Let's get started.

5
00:00:09,780 --> 00:00:11,140
On my screen on the left,

6
00:00:11,140 --> 00:00:13,650
I have an online MD5 hash generator

7
00:00:13,650 --> 00:00:14,620
and on the right,

8
00:00:14,620 --> 00:00:17,870
I have a text file of the US Constitution.

9
00:00:17,870 --> 00:00:19,140
Now, what I'm going to do is

10
00:00:19,140 --> 00:00:21,470
I'm going to copy the entire contents

11
00:00:21,470 --> 00:00:23,430
of the text file of the US Constitution

12
00:00:23,430 --> 00:00:25,750
and place it into the box.

13
00:00:25,750 --> 00:00:28,450
And immediately what is going to end up happening,

14
00:00:28,450 --> 00:00:29,890
if I go up here to the top,

15
00:00:29,890 --> 00:00:31,950
I have the entire text here

16
00:00:31,950 --> 00:00:35,180
and we have this hash string down here at the bottom.

17
00:00:35,180 --> 00:00:37,390
Now, the hash string at the bottom is unique

18
00:00:37,390 --> 00:00:39,600
to that Constitution, to the words

19
00:00:39,600 --> 00:00:41,000
and the order that those are in.

20
00:00:41,000 --> 00:00:44,250
If I change even one letter in the Constitution,

21
00:00:44,250 --> 00:00:46,490
that hash is going to be drastically changed

22
00:00:46,490 --> 00:00:48,340
and I'm going to demonstrate that to you right now.

23
00:00:48,340 --> 00:00:51,320
So if I go here, there's the word defence with a ce

24
00:00:51,320 --> 00:00:52,930
which is the British spelling,

25
00:00:52,930 --> 00:00:55,520
but as Americans, we don't spell it that way anymore.

26
00:00:55,520 --> 00:00:58,900
We spell it as d-e-f-e-n-s-e.

27
00:00:58,900 --> 00:01:00,500
So if I change the c to an s,

28
00:01:00,500 --> 00:01:02,860
watch that hash value as it changes.

29
00:01:02,860 --> 00:01:06,950
I'm going to do that in 3, 2, 1, change.

30
00:01:06,950 --> 00:01:10,350
Notice a vast difference in that hash.

31
00:01:10,350 --> 00:01:12,680
That's why one letter makes all

32
00:01:12,680 --> 00:01:14,640
the difference when you're dealing with hashes.

33
00:01:14,640 --> 00:01:16,390
Because of the way that these are computed,

34
00:01:16,390 --> 00:01:19,880
they come in one way and they come out a different way

35
00:01:19,880 --> 00:01:21,800
and they come out the same way every time

36
00:01:21,800 --> 00:01:23,130
based on what's input.

37
00:01:23,130 --> 00:01:25,200
So let me show you another example of this.

38
00:01:25,200 --> 00:01:28,377
Let's go ahead and say I have the line 1234567890

39
00:01:30,120 --> 00:01:33,460
and I'm going to treat each line as a separate hash now.

40
00:01:33,460 --> 00:01:38,460
If I went and said let's do it as 0123456789,

41
00:01:38,830 --> 00:01:41,400
they're vastly different because I shifted the position.

42
00:01:41,400 --> 00:01:42,890
You probably expect that.

43
00:01:42,890 --> 00:01:47,570
But, what if I did 1234567890,

44
00:01:47,570 --> 00:01:48,860
they're both the same now,

45
00:01:48,860 --> 00:01:50,670
and I added a dot.

46
00:01:50,670 --> 00:01:53,500
Well, that's going to change the bottom one, drastically.

47
00:01:53,500 --> 00:01:55,200
If I go back, it's the same.

48
00:01:55,200 --> 00:01:57,750
If I add a space, it changes it again.

49
00:01:57,750 --> 00:02:00,100
Every time you add any differences,

50
00:02:00,100 --> 00:02:02,540
whether it's one bit or one character,

51
00:02:02,540 --> 00:02:05,650
it's going to drastically change that MD5 output.

52
00:02:05,650 --> 00:02:07,720
And that's why MD5s are considered

53
00:02:07,720 --> 00:02:09,310
a way to verify integrity.

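The number-line comparison from the demonstration can be reproduced without the online generator. The following is a minimal sketch using Python's standard `hashlib` module. The helper names `md5_hex` and `differing_bits` are illustrative assumptions and are not part of the tool shown on screen. The sketch hashes each string as UTF-8 text with no trailing newline, which may differ from how the online generator treats line endings.

```python
import hashlib


def md5_hex(text: str) -> str:
    """Return the MD5 digest of `text` (encoded as UTF-8) as 32 hex characters."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()


def differing_bits(a_hex: str, b_hex: str) -> int:
    """Count how many bits differ between two hex digests of equal length."""
    return bin(int(a_hex, 16) ^ int(b_hex, 16)).count("1")


def main() -> None:
    base = "1234567890"
    # Variants from the demonstration: shifted digits, an added dot, an added space.
    variants = ["0123456789", "1234567890.", "1234567890 "]

    base_digest = md5_hex(base)
    print(f"{base!r:>15} -> {base_digest}")
    for variant in variants:
        digest = md5_hex(variant)
        changed = differing_bits(base_digest, digest)
        print(f"{variant!r:>15} -> {digest}  ({changed} of 128 bits differ)")

    # Hashing the same input again yields the same digest every time.
    print("repeatable:", md5_hex(base) == base_digest)


if __name__ == "__main__":
    main()
```

Running the script prints one digest per input along with the number of output bits that changed relative to the first line, which makes the size of the difference easier to see than comparing hex strings by eye.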
54
00:02:09,310 --> 00:02:11,310
It verifies that the original input

55
00:02:11,310 --> 00:02:13,290
was never changed in transit

56
00:02:13,290 --> 00:02:16,480
because if the hashes match, there could be no changes.

57
00:02:16,480 --> 00:02:19,960
Now, are there any cases where two different things

58
00:02:19,960 --> 00:02:21,650
will give you the same hash?

59
00:02:21,650 --> 00:02:23,810
Yes, this is called a collision.

60
00:02:23,810 --> 00:02:27,760
They do occur because an MD5 only uses 128 bits

61
00:02:27,760 --> 00:02:30,200
to represent that hash value.

62
00:02:30,200 --> 00:02:32,500
That is going to give us a limited number of choices.

63
00:02:32,500 --> 00:02:35,100
But there is an unlimited number of inputs.

64
00:02:35,100 --> 00:02:37,620
So when you have two that give you the same hash value,

65
00:02:37,620 --> 00:02:39,190
that is called a collision

66
00:02:39,190 --> 00:02:40,540
and that is a bad thing.

67
00:02:40,540 --> 00:02:43,480
So, because MD5 only has 128 bits,

68
00:02:43,480 --> 00:02:45,250
it does have more collisions than

69
00:02:45,250 --> 00:02:50,250
something like SHA1 or SHA256 or SHA512.

70
00:02:50,310 --> 00:02:52,410
Because as you extend that space

71
00:02:52,410 --> 00:02:54,950
of what that unique hash value can be,

72
00:02:54,950 --> 00:02:56,300
you have fewer collisions

73
00:02:56,300 --> 00:02:58,990
and that's why most people have moved to SHA1,

74
00:02:58,990 --> 00:03:03,233
SHA256 or SHA512 as we've moved forward into the future.
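The relationship between digest length and the size of the output space can be checked directly. The sketch below, again using Python's standard `hashlib`, reports the digest size of each algorithm mentioned above. The helper name `digest_bits` is a hypothetical name chosen for illustration. The "birthday bound" figure of roughly 2^(bits/2) random inputs before a collision becomes likely is a generic estimate for an ideal hash function; it reflects only the output size and says nothing about weaknesses in any particular algorithm's design.

```python
import hashlib

# Algorithms named in the demonstration, using hashlib's identifiers.
ALGORITHMS = ("md5", "sha1", "sha256", "sha512")


def digest_bits(name: str) -> int:
    """Return the digest length in bits for a hashlib algorithm name."""
    return hashlib.new(name).digest_size * 8


def main() -> None:
    for name in ALGORITHMS:
        bits = digest_bits(name)
        # An n-bit digest has 2^n possible values; for an ideal hash, a
        # collision among random inputs becomes likely after about 2^(n/2).
        print(
            f"{name:>6}: {bits:3d}-bit digest, 2^{bits} possible values, "
            f"birthday bound about 2^{bits // 2} inputs"
        )


if __name__ == "__main__":
    main()
```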