1 00:00:00,530 --> 00:00:02,320 XML vulnerabilities. 2 00:00:02,320 --> 00:00:04,960 In this lesson, we're going to talk about XML, 3 00:00:04,960 --> 00:00:07,330 which is the extensible markup language. 4 00:00:07,330 --> 00:00:09,750 This is used by web applications for authentication 5 00:00:09,750 --> 00:00:11,520 and authorizations and for other types 6 00:00:11,520 --> 00:00:13,480 of data exchange and uploading. 7 00:00:13,480 --> 00:00:14,930 Now, here you might hear some people 8 00:00:14,930 --> 00:00:17,280 talk about this as XML vulnerabilities, 9 00:00:17,280 --> 00:00:20,670 XML exploitation, or even XML injection. 10 00:00:20,670 --> 00:00:23,820 Now, technically, XML injection isn't really accurate. 11 00:00:23,820 --> 00:00:26,000 It's more of an XML parsing vulnerability 12 00:00:26,000 --> 00:00:27,230 that people are exploiting. 13 00:00:27,230 --> 00:00:29,010 But on the exam, you may see it called 14 00:00:29,010 --> 00:00:30,760 any of those three things. 15 00:00:30,760 --> 00:00:32,530 Now, when we talk about XML data, 16 00:00:32,530 --> 00:00:35,180 the XML data itself has to be submitted 17 00:00:35,180 --> 00:00:38,520 from you to the server or from one server to another. 18 00:00:38,520 --> 00:00:40,300 And so, when you're dealing with XML data, 19 00:00:40,300 --> 00:00:41,540 you want to make sure that it's submitted 20 00:00:41,540 --> 00:00:43,700 with encryption or input validation. 21 00:00:43,700 --> 00:00:45,870 If you submit XML data without encryption 22 00:00:45,870 --> 00:00:47,530 or without input validation, 23 00:00:47,530 --> 00:00:49,250 it's going to be vulnerable to spoofing, 24 00:00:49,250 --> 00:00:52,030 request forgery, and injection of arbitrary code. 25 00:00:52,030 --> 00:00:53,880 So, we want to make sure we prevent that. 26 00:00:53,880 --> 00:00:56,220 Again, input validation is here to help, 27 00:00:56,220 --> 00:00:57,860 and so is encryption. 
28 00:00:57,860 --> 00:01:00,720 Now, when you look at XML, it looks something like this. 29 00:01:00,720 --> 00:01:02,910 This is a basic XML that I set up 30 00:01:02,910 --> 00:01:04,460 just to show you the example of it. 31 00:01:04,460 --> 00:01:05,900 You'll notice the first line here. 32 00:01:05,900 --> 00:01:07,880 It has XML listed right in there. 33 00:01:07,880 --> 00:01:11,290 What version of XML and what type of encoding we're using. 34 00:01:11,290 --> 00:01:13,430 The second line, this is what we're defining. 35 00:01:13,430 --> 00:01:15,730 We're defining a question in this case. 36 00:01:15,730 --> 00:01:17,620 And then we have a couple of different fields 37 00:01:17,620 --> 00:01:19,380 inside this question type. 38 00:01:19,380 --> 00:01:20,900 For instance, I have the ID. 39 00:01:20,900 --> 00:01:25,340 In this case, I'm identifying it as CYSA-002-0001. 40 00:01:26,270 --> 00:01:28,500 It's the first question in the second version 41 00:01:28,500 --> 00:01:30,500 of the CySA+ exam. 42 00:01:30,500 --> 00:01:31,600 Then I have a title. 43 00:01:31,600 --> 00:01:33,530 Is this an XML vulnerability? 44 00:01:33,530 --> 00:01:34,600 And then I have a choice. 45 00:01:34,600 --> 00:01:36,880 Yes, and a second choice, no. 46 00:01:36,880 --> 00:01:39,330 You can define these with any kind of terms you want. 47 00:01:39,330 --> 00:01:42,220 This is just how I define the structure for my question. 48 00:01:42,220 --> 00:01:43,520 Because maybe I have a quiz app 49 00:01:43,520 --> 00:01:44,860 and it's going to read this information 50 00:01:44,860 --> 00:01:46,140 and display it to the screen 51 00:01:46,140 --> 00:01:47,570 so you can get different questions 52 00:01:47,570 --> 00:01:49,940 while you're practicing for your CySA+ exam. 
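The slide itself is not part of this transcript, so here is a minimal sketch of what the question document described above might look like, plus the kind of quiz-app code that could read it. The tag names (`question`, `id`, `title`, `choice`) and the `load_question` helper are assumptions reconstructed from the spoken description, not the instructor's actual file.

```python
import xml.etree.ElementTree as ET

# Reconstructed from the spoken description; the tag names are assumptions,
# since the slide itself is not part of the transcript.
QUESTION_XML = b"""<?xml version="1.0" encoding="UTF-8"?>
<question>
  <id>CYSA-002-0001</id>
  <title>Is this an XML vulnerability?</title>
  <choice>Yes</choice>
  <choice>No</choice>
</question>"""


def load_question(xml_bytes: bytes) -> dict:
    """Parse one <question> document into a plain dictionary."""
    root = ET.fromstring(xml_bytes)
    return {
        "id": root.findtext("id"),
        "title": root.findtext("title"),
        "choices": [choice.text for choice in root.findall("choice")],
    }


question = load_question(QUESTION_XML)
print(question["id"], "-", question["title"])
for number, choice in enumerate(question["choices"], start=1):
    print(f"  {number}. {choice}")
```

Every opening tag such as `<title>` is paired with a closing tag such as `</title>`, and the names are whatever the author of the format chose.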
53 00:01:49,940 --> 00:01:51,020 And then you'll see they all, 54 00:01:51,020 --> 00:01:53,720 just like HTML, have a closing tag with a slash, 55 00:01:53,720 --> 00:01:56,690 closing out each element and closing out the XML. 56 00:01:56,690 --> 00:01:57,740 If you see something like this, 57 00:01:57,740 --> 00:02:00,720 I want you to recognize it as XML code. 58 00:02:00,720 --> 00:02:01,980 Now, the next thing we're going to talk about 59 00:02:01,980 --> 00:02:04,490 is some of the exploits that we can have with XML. 60 00:02:04,490 --> 00:02:05,720 Now, the first one we're going to talk about 61 00:02:05,720 --> 00:02:08,840 is an XML bomb or a billion laughs attack. 62 00:02:08,840 --> 00:02:10,520 Now, this is where they take XML 63 00:02:10,520 --> 00:02:13,240 and they define entities that reference other entities, 64 00:02:13,240 --> 00:02:16,320 so the parser expands them to exponential sizes, 65 00:02:16,320 --> 00:02:19,190 consuming memory on the host and potentially crashing it. 66 00:02:19,190 --> 00:02:20,640 So, what does this sound like to you? 67 00:02:20,640 --> 00:02:23,970 Well, it sounds like a bomb or a denial-of-service attack. 68 00:02:23,970 --> 00:02:26,080 If I can go forward and I can start consuming 69 00:02:26,080 --> 00:02:27,590 all these resources on your web server 70 00:02:27,590 --> 00:02:30,180 by uploading some kind of bad XML file, 71 00:02:30,180 --> 00:02:31,320 I can take you down. 72 00:02:31,320 --> 00:02:33,760 And that's what we're trying to do with an XML bomb. 73 00:02:33,760 --> 00:02:36,080 Now, the next one we're going to talk about is an XML 74 00:02:36,080 --> 00:02:38,840 external entity or XXE. 75 00:02:38,840 --> 00:02:40,930 Now, this is an attack that embeds a request 76 00:02:40,930 --> 00:02:42,490 for a local resource. 77 00:02:42,490 --> 00:02:46,070 Hmm, this sounds kind of like a file inclusion, doesn't it? 
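To make the XML bomb arithmetic concrete, here is a rough sketch that builds the nested-entity shape of a billion laughs document and calculates how large it would become if a parser fully expanded it. The helper names `build_entity_bomb` and `expanded_size`, and the entity names `e0`, `e1`, and so on, are hypothetical and chosen only for illustration. Nothing in this sketch is parsed or expanded.

```python
def build_entity_bomb(levels: int, fanout: int, seed: str = "lol") -> str:
    """Return XML whose entity e<levels> expands to fanout**levels seeds."""
    declarations = [f'<!ENTITY e0 "{seed}">']
    for level in range(1, levels + 1):
        # Each entity is just `fanout` references to the entity one level down.
        references = f"&e{level - 1};" * fanout
        declarations.append(f'<!ENTITY e{level} "{references}">')
    dtd = "\n  ".join(declarations)
    return (
        '<?xml version="1.0"?>\n'
        f"<!DOCTYPE bomb [\n  {dtd}\n]>\n"
        f"<bomb>&e{levels};</bomb>"
    )


def expanded_size(levels: int, fanout: int, seed: str = "lol") -> int:
    """Characters produced if a parser fully expands the top entity."""
    return len(seed) * fanout ** levels


# Keep the demonstration tiny; nothing here is parsed or expanded.
small = build_entity_bomb(levels=2, fanout=3)
print(small)
print("payload characters:", len(small))
print("expanded characters:", expanded_size(2, 3))

# The classic shape (9 levels, fanout 10) is only computed, never built in memory.
print("classic expansion:", expanded_size(9, 10), "characters")
```

The point of the sketch is the ratio: the uploaded file stays well under a kilobyte, while the fully expanded text grows by a factor of `fanout` at every level.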
78 00:02:46,070 --> 00:02:47,980 Well, let's take a look at what this looks like. 79 00:02:47,980 --> 00:02:49,980 Well, if we have something like XML, 80 00:02:49,980 --> 00:02:51,240 it's going to look like this. 81 00:02:51,240 --> 00:02:52,360 And we're going to have xml, 82 00:02:52,360 --> 00:02:54,160 the version, and the encoding type. 83 00:02:54,160 --> 00:02:55,670 Then we have the document type. 84 00:02:55,670 --> 00:02:57,090 We're defining this as foo, 85 00:02:57,090 --> 00:03:00,330 which is just coder speak for some junk variable. 86 00:03:00,330 --> 00:03:02,360 Then, we have the element foo ANY, 87 00:03:02,360 --> 00:03:04,810 we have the entity xxe SYSTEM, 88 00:03:04,810 --> 00:03:06,170 and then we have that file, 89 00:03:06,170 --> 00:03:10,940 file:///etc/shadow, and then we end this out 90 00:03:10,940 --> 00:03:12,620 with some kind of element in XML, 91 00:03:12,620 --> 00:03:13,940 in this case, we're calling it foo 92 00:03:13,940 --> 00:03:16,130 instead of question, or title, or ID. 93 00:03:16,130 --> 00:03:17,750 So, what really are we looking for in here 94 00:03:17,750 --> 00:03:19,050 that's really looking bad? 95 00:03:19,050 --> 00:03:21,200 Well, that /etc/shadow file, right? 96 00:03:21,200 --> 00:03:24,500 By looking at that file:///etc/shadow, 97 00:03:24,500 --> 00:03:27,130 that tells me they're trying to do a file inclusion. 98 00:03:27,130 --> 00:03:28,970 And because we're doing it through XML, 99 00:03:28,970 --> 00:03:31,600 this is known as an XML external entity 100 00:03:31,600 --> 00:03:34,120 or XXE type of attack. 101 00:03:34,120 --> 00:03:37,050 Now, to prevent XML vulnerabilities from being exploited, 102 00:03:37,050 --> 00:03:38,780 what do you think you want to do? 103 00:03:38,780 --> 00:03:41,560 You want to use proper input validation, that's right. 104 00:03:41,560 --> 00:03:44,510 Input validation, input validation, input validation. 
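The XXE document walked through above can be written out roughly as follows, together with one very coarse form of input validation. This is a sketch under assumptions: the payload is reconstructed from the spoken description (the `UTF-8` encoding value is a guess), and `validate_xml_input` is a hypothetical helper that assumes the application never legitimately needs a DTD in uploaded XML.

```python
# Reconstructed from the spoken description; the encoding value is an assumption.
XXE_PAYLOAD = """<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE foo [
  <!ELEMENT foo ANY>
  <!ENTITY xxe SYSTEM "file:///etc/shadow">
]>
<foo>&xxe;</foo>"""

SAFE_DOCUMENT = """<?xml version="1.0" encoding="UTF-8"?>
<question><id>CYSA-002-0001</id></question>"""

# Markers that only appear when a document brings its own DTD or entities.
FORBIDDEN_MARKERS = ("<!DOCTYPE", "<!ENTITY")


def validate_xml_input(xml_text: str) -> str:
    """Return the text unchanged, or raise ValueError if it declares a DTD."""
    for marker in FORBIDDEN_MARKERS:
        if marker in xml_text:
            raise ValueError(f"rejected XML input: contains {marker}")
    return xml_text


# Check both documents before any parser ever sees them.
for name, document in (("safe", SAFE_DOCUMENT), ("xxe", XXE_PAYLOAD)):
    try:
        validate_xml_input(document)
        print(name, "-> accepted")
    except ValueError as error:
        print(name, "->", error)
```

A string check like this is deliberately blunt. It is usually paired with a parser configured to refuse DTDs and external entities, so the two controls back each other up.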
105 00:03:44,510 --> 00:03:47,320 We keep talking about it, but it's really that important. 106 00:03:47,320 --> 00:03:48,750 Are we sensing a theme here? 107 00:03:48,750 --> 00:03:50,920 If we validate the input from a user, 108 00:03:50,920 --> 00:03:52,530 whether it's a URL being inputted, 109 00:03:52,530 --> 00:03:53,920 a file being inputted, 110 00:03:53,920 --> 00:03:55,520 a field being entered on a website, 111 00:03:55,520 --> 00:03:58,140 we can prevent a lot of these security issues. 112 00:03:58,140 --> 00:04:00,270 So always remember, input validation, 113 00:04:00,270 --> 00:04:02,110 anytime the user is giving you something 114 00:04:02,110 --> 00:04:04,540 and that'll help prevent a lot of these different attacks. 115 00:04:04,540 --> 00:04:06,080 Now, for the exam, 116 00:04:06,080 --> 00:04:08,320 if you see something with XML written in it, 117 00:04:08,320 --> 00:04:10,970 and it is clearly XML, guess what? 118 00:04:10,970 --> 00:04:13,650 It's going to be an XML vulnerability that's being exploited. 119 00:04:13,650 --> 00:04:15,640 They might call this XML vulnerability. 120 00:04:15,640 --> 00:04:17,610 They might call it XML exploitation. 121 00:04:17,610 --> 00:04:19,410 They might call it XML injection. 122 00:04:19,410 --> 00:04:20,400 Whatever they're talking about, 123 00:04:20,400 --> 00:04:23,410 it's still an XML vulnerability that's being exploited here. 124 00:04:23,410 --> 00:04:25,460 Now, if you see anything that looks like the code 125 00:04:25,460 --> 00:04:28,410 in the format that I showed you in this lesson, guess what? 126 00:04:28,410 --> 00:04:30,020 It's probably XML. 127 00:04:30,020 --> 00:04:31,560 Now, the only tricky part with this 128 00:04:31,560 --> 00:04:35,080 is that XML code can often look a lot like HTML code, 129 00:04:35,080 --> 00:04:36,860 or it might look like JavaScript. 
130 00:04:36,860 --> 00:04:39,020 The big difference is that when you're dealing with HTML 131 00:04:39,020 --> 00:04:40,860 or JavaScript, there are defined keywords 132 00:04:40,860 --> 00:04:42,650 for each of those bracketed entries. 133 00:04:42,650 --> 00:04:45,500 With XML, you can make those say anything you want, 134 00:04:45,500 --> 00:04:47,570 depending on how you're configuring your XML. 135 00:04:47,570 --> 00:04:51,070 So, just take a second to read the code and identify, 136 00:04:51,070 --> 00:04:52,640 does this look like HTML? 137 00:04:52,640 --> 00:04:54,420 Are they using something like font 138 00:04:54,420 --> 00:04:57,900 or image or href, that's HTML. 139 00:04:57,900 --> 00:04:59,380 If they're using something like question, 140 00:04:59,380 --> 00:05:03,170 or ID, or type, or element, or entity, that's XML. 141 00:05:03,170 --> 00:05:04,290 And so, this will help you figure out 142 00:05:04,290 --> 00:05:05,370 which one they're referring to, 143 00:05:05,370 --> 00:05:06,970 so you can get the right answer.
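The "read the tag names" advice above can be sketched as a small heuristic. This is only an illustration: `guess_markup` is a hypothetical helper, the list of known HTML tags is deliberately incomplete, and it looks at tag names only (so it checks for `img` and `a`, since `href` is an attribute rather than a tag).

```python
import re

# A small, deliberately incomplete sample of tag names defined by HTML.
KNOWN_HTML_TAGS = {"html", "head", "body", "a", "img", "font", "p", "div", "span", "table"}

# Matches the name of an opening tag; skips closing tags, <?xml ...?> and <!DOCTYPE ...>.
TAG_NAME = re.compile(r"<([A-Za-z][\w.-]*)")


def guess_markup(snippet: str) -> str:
    """Guess 'HTML' when most opening tag names are known HTML tags, else 'XML'."""
    names = [name.lower() for name in TAG_NAME.findall(snippet)]
    if not names:
        return "unknown"
    known = sum(name in KNOWN_HTML_TAGS for name in names)
    return "HTML" if known * 2 > len(names) else "XML"


print(guess_markup("<question><id>1</id><choice>Yes</choice></question>"))
print(guess_markup('<p><font color="red">Hi</font> <a href="x"><img src="y"></a></p>'))
```

On an exam you do this by eye, but the logic is the same: predefined names point to HTML, and free-form names such as question or choice point to XML.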