MehtA+ Secrets & Signals: Finance and Fraud – Adve ...
Recording Seminar
Video Transcription
Congratulations, Math Kangaroo winners, fantastic job! Mr. Bagheer, if you'd like to introduce yourself. Yeah. Hi, everyone. I'm Bagheer. Oh, there's an echo. Oh, I can try to mute myself, but I don't hear that echo, but yeah. Okay. So, I received my bachelor's and master's in electrical engineering and computer science, all in four years, from Stanford back in 2022. And I'm the co-founder of MehtA+ along with my sister. In the past, I've interned at Amazon and Microsoft, and now I'm working as a software engineer at a startup that spun out of a Stanford lab right here in Silicon Valley. And I, like all of you, was a Math Kangaroo participant. I placed first internationally twice and top 10 nationally several times. Hi, I'm Ms. Haripriya. I did my bachelor's and master's in electrical engineering and computer science at MIT. I am co-founder of MehtA+, which we'll talk about later on in our presentation. I'm also a software engineer at Microsoft, and like you, I have placed top 10 nationally at Math Kangaroo. Oh, I'm so sorry. I have a call. One second. Let me just take it. Hello? Oh, Dr. Roo, you again? I'm in the middle of a presentation. You sent a video to the kids? Oh, you were delivering a package, like a congratulations gift to the kids. And oh, okay. But you're in some trouble now. I see. Okay. So, we have a video. Mr. Bagheer, if you'd like to share? Yes. If you can give me a second. Okay. Can people see this? Yes. Okay, great. We can't hear any of the noises. Can you share your sound? Yes. Stop sharing. It's on the side, I think. Zoom has changed it. Yes, thank you. I can't hear anything. Dr. Roo, please come with us. Who are you? The United States Postal Inspection Service. Few know about our existence, but we take the job of protecting the mail very seriously. Yes, I am familiar. 
The oldest continuously operating federal police agency in the world. What? Cat got your tongue? Despite your small numbers, you prevent tons of crime throughout the U.S. I'm assuming you need my help with the case? Yes. We were sorting through the mail when we noticed a mysterious letter with a rip from which a USB drive dropped out, addressed to a Pete the Peacock. When I plugged it in, I saw a single image on the drive. A postcard of, of course, Krakow, in Poland. What a beautiful city. Yes, I've been there myself. Only, it looks like this picture has been tampered with. This is what the image is supposed to look like. We believe someone is trying to send some secret code through the mail. Can you please help us figure out what it says? I'm sorry, I think there's a mismatch between the sound and the video, which is the way Zoom works, but hopefully you got all of that. And make sure you keep on listening, because we are going to have questions on this. Can you see my screen now? Yes. You can go ahead, Mr. Bagheer. Great. So yeah, you can see the original image over here, and if you can go to the next slide, please. This is the image with, we believe, some sort of hidden message embedded. You probably didn't notice any differences between the two slides. So if you go to the next slide, please. If you do a 32x zoom on the images, you can see maybe very subtle differences. Maybe some of you can see them within these red circles. If you look at some of the colors and compare them between the two images, they're a little washed out on the right-hand side, right? Like maybe the red isn't as bright, for example, for the woman holding the purse in the bottom middle of the image. So hopefully you can see some of the differences, and you can see why we think that there's something wrong with the image on the right-hand side. So let's figure out what that is. But before we go into that, let's figure out why the colors are off. 
So we need to know how do computers store colors, right? And so, well, first of all, computers use the binary system to store numbers, right? Instead of the decimal system. So that means that each numeral used is a 0 or 1. So instead of ranging from 0 through 9, like we normally do in the decimal system, and also each numeral used is called a bit instead of a digit. So one of the most common schemes used for coloring is RGB, where shades of three different wavelengths of light, red, green, and blue, are combined in various intensities to produce a color. So you can see in the diagram on the right hand side, like there's, you know, different combinations of blue, red, and green can produce different colors. And so a computer image is made up of pixels, and a color is stored at each pixel. And the intensity of each wavelength that the color is comprised of, red, green, and blue, is represented by an 8-bit number, like 01010110. So, for example, like black we can say is comprised of the three wavelengths of red with an intensity of 0, green intensity of 0, and blue intensity of 0. And white, similarly, it's made up of the three wavelengths red, green, and blue, but each with the intensity of 255. But remember, computers use binary. So we can't just say that, you know, the red, green, and blue values are 0 or 255. We have to convert these numbers from decimal into binary. And here's some of the numbers from the right hand side chart. You know, in decimal, we have the number 0. In binary, we represent it as eight 0s. 128 is represented as 1, followed by seven 0s. And 255 is represented as eight 1s. So, yeah, you can see on the right hand side chart, maybe some of the most common colors and what they look like in RGB. Yeah. Okay. Next slide, please. But some of you might be wondering, okay, how do you know that 128 in decimal converts to 1 followed by seven 0s in binary? So, the same way 128 is basically the sum of the powers of 10, right? 
It's 1 times 100 plus 2 times 10 plus 8 times 1, where 1 is 10 to the power of 0. We can also express 128 as a sum of powers of 2. It turns out that 128 itself is a power of 2. So if we express it as a sum of powers of 2, we get 1 times 2 to the power of 7 plus 0 times 2 raised to the power of 6 plus 0 times 2 raised to the power of 5 and so on; all the other coefficients are 0. And so it's represented as 1 followed by seven 0s. Does that roughly make sense to everyone? I know some of you may be familiar with number bases. We're not going too deep into number bases, but they are important to understand the theory behind how we embed images in other images. Some of you are asking about hexadecimal. Hexadecimal is base 16. That's an alternate way to represent color. But for today's class, we're going to focus mostly on binary and, of course, decimal, because the numbers we work with in math class are always decimal. Yeah, so we have a question. Is it always like the sum of 2 to the power of something? Yes. The same way decimal is always the sum of powers of 10, binary is the sum of powers of 2. I see a hand raised. I'm not really sure how to call on them. So if you want to type your question in the chat, let us know. But we can move on. Feel free to type your questions in the chat. OK, so if you want to represent each of the colors and store them on the computer, you do it by writing down the triplet of the red, green and blue intensities. So orange is represented as 255, 128, 0 in decimal. But in binary, that's eight ones, followed by a one and seven zeros, and then eight zeros. Yeah. So that brings us to our first question. My sister will bring up the answers in just a bit. But remember, you can't change your answer once it's selected. So please be sure you're comfortable with your answer before you click submit. OK, so here's the question. 
Each unique triplet for an RGB color represents a unique color and consists of three 8-bit numbers representing the red, green and blue intensities. So, for example, eight 0s followed by eight 1s followed by eight 1s. That refers to the color aqua. And remember, each bit must be either zero or one. So how many unique RGB colors are there? Please don't type your answer in the chat, because you're giving it away to other people. There is a competition with a small prize that you are all competing for. I mean, you're free to type it in the chat if, for whatever reason, you can't submit in the poll. But please try to submit in the poll, because it's hard for us to gather the answers in chat. Yeah. And some of these questions are super easy, so I really don't want to give more than one minute. After this first question, I'll give you one minute for some of the other easier ones. If I see most of the people have answered the question, then I might give you 30 seconds. But for this first one, you might have to figure out where the poll feature is. So I'll let you figure it out. And so while you are thinking about that, I'll answer some questions in the chat. What happened to yellow, red, blue? Yellow, red, blue is used in art class, perhaps, when you're mixing maybe watercolors together. But the way the human eye works, we have red, green and blue cones. And similarly, computers also have red, green and blue LEDs. So you can form colors in both ways. It's just a question of what components you are using to generate a color. Yeah. That's right. I see four more participants. I know I'm a little over the one minute mark. But if you can please fill in your answers. You have three, two, one. OK, I'm ending the poll. OK. So the answer is. Yeah, the correct answer is C. And then next slide, please. So here's one easy way to maybe think about the problem. 
So imagine that instead of three separate eight bit numbers, we just join the three numbers together and represent each color as a single 24 bit number. Right. So if we have eight zeros, eight ones, eight ones, that becomes instead like just one giant number with eight zeros followed by 16 ones. Right. So we have a single 24 bit number. OK, next slide, please. Now the question becomes then how many different 24 bit numbers are there? Right. Because each different 24 bit number corresponds to a unique, you know, triplet of eight bit numbers, which each correspond to a unique color. Right. So if we want to figure out how many one bit numbers there are, the answer would be simple. There's two, there's zero or one. How many two bit numbers are there? There are two possible values for each bit. So two times two, which is four. So, you know, we have zero, zero, zero, one, one, zero, one, one. Next slide, please. So how many three bit numbers are there? You know, much the same way you follow the same pattern. There's two times two times two possibilities for each bit. So that gives us eight total unique combinations of possibilities. And so how many 24 bit numbers are there? It's two times two times two, you know, 24 times. Or in other words, two raised to the power of 24. OK, before we go on. Oh, can you hear me? Sorry, I don't know. OK, yeah. Before we go on to the next question, some of you may have put 255 to the third power. Someone put in the chat, it would be 256 to the third power, right? Because zero is inclusive. It goes from zero to 255. So that's 256 numbers. The competition is for the correct answer, not necessarily the fastest answer. But, of course, you should answer the question before I close the poll. And I cannot see you. Someone add the question for that. And yes, your AP intro to CS and robotics classes are going to help you today. Both Mr. Bagith and I are software engineers. 
And a lot of these concepts are things that are covered in computer science and electrical engineering classes at the university level. So with that, question number two. Now, don't use your calculators for this, please. Just try to figure out if you can approximate how much two raised to the power of 24 is. Just in your mind. So here's a hint. Two raised to the power of 24 is equal to two raised to the power of 20 times two to the power of four. Or two to the power of 10, squared, times two to the power of four. And consider that two raised to the power of 10 is 1024, which is close to a power of 10. So is it A, 48? B, 4 million? C, 8 million? D, 16 million? For this one, I'll only give 45 seconds. We are at the 25 second mark. Make sure to answer in the poll and not in the chat. Five more seconds. Five, four, three, two, one, end poll. And the answer is D. So how do we get there? Well, it's easy if we break the number down into a product of powers that are easier to calculate or approximate. So we have two raised to the power of 24. And we saw that that's equal to two raised to the power of 10 times two raised to the power of 10 times two raised to the power of four. Or 1024 times 1024 times 16. Now, this might be challenging to do in your head. But remember, 1024 is about 1000. And this is actually a common trick that computer scientists use to approximate powers of two. So 1000 times 1000 is just 1 million, right? And 1 million times 16 is 16 million. OK, so question number three. Now, humans can distinguish between 1 million colors due to 6 million cones being in each eye. Can humans differentiate between all RGB colors that can be created with three 8-bit numbers? Like we said, there's about 16 million of those. And the second part of this question is, can humans distinguish between all RGB colors that can be created with three 7-bit numbers? So the answers are A, yes, yes. B, yes, no. C, no, yes. 
D, no, no. Since there are two questions embedded in this question, I will give a minute. Although some people were very quick and have already responded, which is awesome. There are some questions about whether this is hex or RGB. Those are actually pretty much equivalent color schemes. It's just that RGB usually uses decimal numbers and hex uses hexadecimal numbers. Here we're using binary, because ultimately hex is just a more compact version of binary, and in computers everything is ultimately stored as binary. If you combine four binary digits, four bits, together, that gives you a single hexadecimal digit. But that would introduce yet another base. Three more seconds. Three, two, one, zero. Yes. Okay. So the answer is D, no and no. Hopefully you all see why, but let's explain why. We said earlier that there are about 16 million RGB colors that can be created with 8-bit numbers. 16 million is greater than 1 million. So humans cannot distinguish between all RGB colors that can be made with three 8-bit numbers. There are just more colors that exist out there on computers than our eyes can distinguish between. Okay. Next slide please. Now, to answer the second part of the question, we have to figure out how many unique 7-bit triplets there are. Remember, we said that there are exactly two raised to the power of 24 RGB colors with eight bits for each of the color elements. That gives us about 16 million colors. Similarly, just as an 8-bit triplet can be thought of as a single 24-bit number, a 7-bit triplet can be thought of as a single 21-bit number. So in the same way you can determine that there are two raised to the power of 21 RGB colors. Next slide please. 
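The counting argument so far can be checked with a few lines of Python. This is only a sketch: the function name `count_colors` and the constant `HUMAN_DISTINGUISHABLE` are made up for illustration, and the 1 million figure is simply the number quoted in the seminar.

```python
def count_colors(bits_per_channel: int, channels: int = 3) -> int:
    """Number of distinct colors when every channel uses the given number of bits."""
    # Joining the channels gives one number with bits_per_channel * channels bits,
    # and each extra bit doubles the number of possible values.
    return 2 ** (bits_per_channel * channels)


HUMAN_DISTINGUISHABLE = 1_000_000  # figure quoted in the seminar (assumption)

for bits in (8, 7):
    total = count_colors(bits)
    more_than_eye = total > HUMAN_DISTINGUISHABLE
    print(f"{bits} bits per channel: {total:,} colors, more than the eye can separate: {more_than_eye}")

# The 1024-is-about-1000 shortcut: 2**24 = 1024 * 1024 * 16, roughly 1000 * 1000 * 16.
approx_24 = 1000 * 1000 * 16
exact_24 = 1024 * 1024 * 16
```

The exact count for 8 bits per channel is the same whether it is computed as 2 to the power of 24 or as 256 cubed, which is the point made in the chat about zero being included.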
And two raised to the power of 21 is just two raised to the power of 10 times two raised to the power of 10 times two, which is about 1000 times 1000 times two, so about 2 million. Alternatively, you can also observe that two raised to the power of 21 and two raised to the power of 24 differ by a factor of two raised to the power of three, which is eight. So, basically, two raised to the power of 21 would be an eighth of 16 million, or 2 million. Right, so 2 million is also greater than 1 million, so humans cannot distinguish between all RGB colors that can be made with three 7-bit numbers. Now here's just a quick question for the chat. Do you think humans can distinguish between all RGB colors that can be made with three 6-bit numbers? Probably, in the sense that there are definitely fewer colors than we can distinguish between, but it could still be that some of the colors are too fine to distinguish between, and there may be some other colors that we can recognize that computers can't represent. Okay, next. I got some questions about quantum computing. A lot of that gets complicated because you're not really in base 2: instead of being 0 or 1, you can be 0 and 1. So we won't get into that, but yeah, it's an interesting field for sure. And I do not have a preference. Mr. Bagheer, I don't know if you have a preference on duodecimal, that is base 12, or base 20. Some people say duodecimal is interesting, and that it would be interesting to make that the base that we learn all our numbers in, because so many numbers are factors of 12. Yeah, but I think decimal is good enough for us, right? We have 10 digits. Why? Because we have 10 fingers, and fingers and toes are also known as digits. But base 12 is pretty cool. Base 16, I think, can be nicer since it is compact, but that complicates things, and then you start to have letters, and it just keeps going on. 
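To make the bases discussed so far concrete, here is a small sketch that writes a color intensity as a sum of powers of 2 and shows the same triplet in binary and in hex. The helper name `to_binary` is hypothetical; the orange triplet (255, 128, 0) and the example value 01010110 come from the talk.

```python
def to_binary(value: int, width: int = 8) -> str:
    """Write value as a sum of powers of 2 and return the resulting bit string."""
    bits = []
    for power in range(width - 1, -1, -1):
        if value >= 2 ** power:
            bits.append("1")      # this power of 2 is part of the sum
            value -= 2 ** power
        else:
            bits.append("0")      # this power of 2 is not used
    return "".join(bits)


orange = (255, 128, 0)
binary_triplet = [to_binary(c) for c in orange]

# Every group of four bits becomes one hexadecimal digit, so 8 bits give 2 hex digits.
hex_code = "".join(f"{c:02X}" for c in orange)

print(binary_triplet)
print(hex_code)
```

Hex is shown only to illustrate why it is described as a more compact version of binary; the rest of the talk stays in base 2.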
So right now we're sticking to RGB and we're expressing everything in base two. We're not even getting to more complex color schemes like CMYK or things like that. And someone else is asking an interesting question: do the two eyes of the human not make it 2 million? Well, your two eyes should be virtually identical, so you can see 1 million colors with the left eye, but you can see basically the same colors with the right eye, and so that's 1 million colors total. The reason why humans, or animals in general, tend to have two eyes is for depth perception. It's not for increased color range. But interesting question. Okay. So our next question. How many RGB colors can be created if we use two bits for every number in the RGB triplet instead of eight bits? In other words, we have three 2-bit numbers. And please remember to answer in the poll. A, 6. B, 12. C, 64. D, 529. And given how fast people are responding, I think I'll just give 30 seconds for this one. Someone who's attended our lectures before asked why there are four quizzes in a row. We like to mix things up and keep you on your toes, so that anyone who's new to it isn't at a disadvantage. Someone likes seximal more, which is base 6. Basically, whatever you're used to using is the most convenient. There's no real "this is the faster way of computing things". Zeros and ones are easy to store in the digital world, and so that's why everything's in base two. Base 16 is, again, a power of two, so it's easy to convert between the two. I will end the poll in three, two, one. I went to a minute. But, okay. So, the answer is C, 64. Okay. We can see why. Next slide please. So yeah, here's an example of what an RGB color looks like when you only have two bits for each number. It could be something like 00, 01 and 11 for the R, G and B components. 
So imagine that instead of three separate two bit numbers we just joined the three numbers together and represent each color as a single six bit number right and remember this is the same logic as we've been applying all throughout. Recall earlier how we determined there were two raised to the power of 24 different 24 bit numbers and two raised to the power of 21 different 21 bit numbers. So in the same way we can see that there are two raised to the power of six equals to 64 different six bit numbers. Yeah. Alternatively, you can also see that for a single two bit number there are four options 00, 01, 10 and 11. So for three two bit numbers there are four times four times four equals to 64 possible different combinations. Yeah. And yeah, some people are guessing maybe based off just what are the other choices but I would recommend that you know you just try to get the actual correct answer because if there's a free response otherwise it'd be tricky for you. So, yeah. Next slide please. So, we learned a lot about colors, a lot of interesting things about how the human eye works, how these base to you know numbers are represented in computers. Let's talk about steganography. Okay, steganography is the art of hiding information within another object often in plain sight. So sometimes you might embed like text or image for example within another image. Next slide please. So, if you remember our original image and the image with the what we believe has a hidden message, right, we can see there's very subtle changes to the pixel color values, but those subtle changes could be enough to embed another image inside of the first. Next slide please. So how is this possible? Let's take a look at this beautiful image on the left. This is another city in Poland, Gdansk. So instead of using eight bit RGB triplets let's use six bit RGB triplets to view it. So, next slide please. This is with six bit RGB triplets, right. Do you notice the difference? Probably not, right. 
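One way to picture the six-bit version of an image is to keep the first six bits of every 8-bit intensity and set the last two to zero. That is an assumption about how the slide was produced, and `keep_top_bits` is a made-up helper name; the sketch only shows how small the change per value is.

```python
def keep_top_bits(value: int, bits: int = 6) -> int:
    """Keep the first `bits` bits of an 8-bit value and zero out the rest."""
    drop = 8 - bits
    return (value >> drop) << drop


pixel = (255, 128, 86)                       # an arbitrary example pixel
reduced = tuple(keep_top_bits(c) for c in pixel)

# Largest change any single intensity can undergo when two bits are dropped.
largest_change = max(v - keep_top_bits(v) for v in range(256))

print(pixel, "->", reduced, "largest possible change:", largest_change)
```

Each intensity moves by at most 3 out of 255, which is consistent with the two pictures looking the same at normal zoom.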
So, based off our earlier calculations, we said there are about 16 million 8-bit colors and 2 million 7-bit colors. So then that would give us about 256k 6-bit colors, right, 256,000. And the human eye can distinguish between 1 million colors, but we'd have to really zoom in to notice the color differences at the pixel level. So, next slide please. Also, someone asked if women are better at distinguishing color. Indeed, women are better at distinguishing subtle distinctions in color, but men appear more sensitive to objects moving across the field of vision. Someone else had a question about MehtA+: is it related to Facebook's Meta? We had this name first. It's a play on words on our last name, Mehta, with A+ because of our A+ teaching. And yeah, so it's our organization name, but we'll talk more about that later. How many quizzes in total will there be? That is a surprise. We want to make sure that you are on your toes. And any other questions? What does 23 squared have to do with this question? That was one of the answer choices. It's just an incorrect answer. So, nothing. Okay, so it's difficult to notice such minute differences without zooming in, even if we can tell the different colors apart. Again, you'd have to be at the pixel level and comparing both images pixel by pixel to see the color differences. Next slide please. Now, even though we're using only six bits for each element in the RGB triplet, the computer still needs to store an 8-bit number, right? Because there's an 8-bit slot in the computer for each number. So it just uses zeros to fill in the last two bits. In other words, that means in each 8-bit triplet, we have extra room for a 2-bit triplet to fit in, because those last two bits aren't really being utilized. Next slide please. Now let's take a new image that's the exact same size as our previous image. 
What happens if we were to represent each color with 2-bit RGB triplets? Right now that's 8-bit RGB triplets, but let's convert it to 2-bit RGB triplets. Next slide please. So we've converted the image. We can see there's definitely quite a difference from the original picture, given how rich the colors were, but this image is still pretty detailed, given that our palette now only has 64 colors. And we can still make out what the image is supposed to be. Okay, next slide please. So now let's fill in the empty two bits in each element of every RGB triplet of our first image with the bits from the second image. This image on the left, that's exactly what we've done. Okay, next slide please. So you can compare, side by side, the original image with the 8-bit colors and this new image with the second image embedded in the first. Do you see any difference between the two images? No, right? Not much, at least certainly not at this resolution; this is zoomed out. Okay, next slide please. Yeah, so ultimately what we're doing is taking the first six bits for each color element from the Gdansk image and the first two bits from the MissPaws image, and we're combining them together to get an 8-bit number. Okay, next slide please. So each color element value will either stay the same or be off by up to three, since only the last two bits change, which is a virtually imperceptible difference. Next slide please. So then the question is, how do we extract the hidden message? If this is the encoding technique used to embed a second image within this image, how do we extract this hidden message? Any ideas in the chat? Yeah, someone asked, doesn't this mean that you can use the last two bits to pick up the message? Exactly. So let's just extract the last two bits from each pixel and extract the hidden message. Next slide please. Okay, I think this is a video, so we're going pixel by pixel trying to extract the image and... Next slide please. Is there a new slide? 
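The embedding and extraction steps described above can be sketched on single pixels with a few bit operations. This is a minimal illustration, not the presenters' actual tool: the function names and the example pixel values are invented, and a real image would apply the same operation to every pixel.

```python
def embed(cover: int, secret: int) -> int:
    """Keep the first six bits of cover and store the first two bits of secret in the last two."""
    return (cover & 0b11111100) | (secret >> 6)


def extract(stego: int) -> int:
    """Read the last two bits and move them back to the front of an 8-bit value."""
    return (stego & 0b00000011) << 6


def embed_pixel(cover_px, secret_px):
    """Apply embed to the red, green and blue values of one pixel."""
    return tuple(embed(c, s) for c, s in zip(cover_px, secret_px))


def extract_pixel(stego_px):
    """Apply extract to the red, green and blue values of one pixel."""
    return tuple(extract(v) for v in stego_px)


cover_px = (200, 120, 37)     # hypothetical pixel from the visible image
secret_px = (255, 64, 130)    # hypothetical pixel from the hidden image

stego_px = embed_pixel(cover_px, secret_px)
recovered_px = extract_pixel(stego_px)

print("stego pixel:", stego_px)
print("recovered (2-bit) pixel:", recovered_px)
```

The recovered pixel only carries two bits per channel, so it matches the 64-color version of the hidden image rather than the original, while the visible pixel changes by at most 3 per channel.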
Yeah, so this is the image we recovered. It says, hello Pete the Peacock, buy 100,000 shares of Nefarious Inc. at $19 today at 2 p.m. Anyone know what this is? Someone gives you a stock tip and tells you to buy a certain stock at a certain time, maybe even at a certain price. It's illegal, exactly. This is insider trading. Beautiful. Okay, I'm glad you know something's wrong. Okay, next slide, please. Okay, go ahead. Oh, no, no, so go ahead. I was just going to ask if you have a video link; maybe I can try sharing this time, just because I think there's a lag on your side. It looks like a clear case. Sure, go ahead, yeah. Can you share the link with me? Yes. Looks like a clear... Just one second. Looks like a clear case of insider trading, but who on the inside of the company can be feeding Pete the Peacock information? There may be some clues inside of the company's accounting statements. Let's head over to the SEC to take a look. You got it. I'll take a look at the website to see if I can find any clues as to Pete the Peacock's trading strategies. Okay, people are asking, can we follow the advice on the postcard to become super rich? Nefarious Inc. exists within the stock exchange of the world that Dr. Roo and his friends live in. It exists in their world, maybe not in ours. So I would not go around investing in Nefarious Inc. There is no stock trading advice being given here today, but we are giving advice on how you can detect fraud. So let's say we're going through some accounting statements. Can we see if there's anything fishy in here? Next slide, please. So first we need to figure out some techniques for telling why something might look fraudulent, or how we can detect that there might be some fraud going on. There's one interesting way to do it, which is using Benford's law. Benford's law is an observation that Simon Newcomb first made in 1881. He realized that continuous quantitative data tends to follow a certain pattern. 
Next slide, please. And that pattern is that the distribution of data looks something like this. Let's say you're looking at the heights of all the buildings in New York City, and you lay the heights out in feet. If you look at the initial digit of all those numbers, about 30% of them would start with a one, about 17% of them would start with a two, and about 12% would start with a three. And you can see how this distribution follows. It's actually an inverse logarithmic distribution. So if you go to the next slide, please. Yeah, so roughly speaking, the probability that a number will start with a digit d is the log, base 10, of one plus one over d. Okay, next slide, please. So here's a high-level explanation as to why numbers tend to follow this distribution. Let's focus on all stocks in the stock market whose prices are two digits, so anything ranging from $10 to $99. For a stock price to get to $20 or more, it must first exceed $19, right? So you have to get over that hump of $10 to $19 in order to be higher than that. Now, the higher the stock price, the less likely it is to be reached. It's harder to reach higher values, which makes sense. So it'll be much more common for stocks that exceed the $10 barrier to be at $10 to $19 than it will be for them to be at $90 to $99. Does that make sense? Yeah, and then you could do the same thing for all three-digit numbers and then all four-digit numbers, and so on and so forth. So let's take a random distribution of data, like the prices of all stocks in the stock market. Usually we need hundreds of data points just to see this law work, but for this example, we'll look at 20. Okay, next slide, please. So here's an example of 20 stocks, right? I've laid out the prices in increasing order. Next slide, please. 
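The formula quoted above, log base 10 of one plus one over d, can be tabulated directly. A short sketch, with `benford_probability` as a made-up helper name:

```python
import math


def benford_probability(d: int) -> float:
    """Probability that a number's first digit is d under Benford's law."""
    return math.log10(1 + 1 / d)


benford = {d: benford_probability(d) for d in range(1, 10)}

for d, p in benford.items():
    print(f"first digit {d}: {p:.1%}")
```

The nine probabilities add up to 1, because the product of (1 + 1/d) for d from 1 to 9 is exactly 10, and the log base 10 of 10 is 1.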
And so if you look at our distribution of how frequently we see a number starting with some digit, right? Next slide, please. And let's compare this to Benford's law. You can see they look relatively similar, right? And that's with just 20 data points. If we had more data points, it would probably look smoother. Interesting questions. Does the specific distribution only work for building heights or everything? Yeah, so it can be tricky to know where to apply this correctly, right? So if the distribution is large enough and so you have at least like, let's say 500 data points, 1000 data points, you can apply it. And also there needs to be quite a range for the actual data points themselves, right? So like humans don't usually get to be about 10 feet tall. I think the tallest person on record is like eight feet, 11 inches. So you can't really do this with like human heights because there's a very finite short range for what human heights can be. But you can do this with like income levels or people's salaries, you know, for example, or like the number of inhabitants in cities across America. So yeah. Next slide, please. Yeah, and this is made up data. This is made up data, yes. Just as an example. Yes, yeah. And five, six, seven, eight, nine all have the same frequency. Does this mean anything? It means that like one stock started with the price or had a price that started with five, one stock had a price that started with six. That's all that means, right? Because basically just that figure that I showed you, that's how we generated that distribution on the left. If we had a lot more data points, hundreds of data points, you'd probably see a distribution that's similar to Benford's law. If you were to actually do this on the like, you know, stock market today, you would definitely see some distribution like Benford's law. Someone's saying it looks kind of like an exponential curve. Kind of sure. 
But think about the direction in which an exponential curve is generally growing, right? You're increasing your Y value exponentially as you move towards the right on the X axis. This is the opposite direction of growth, and it's logarithmic. Okay, cool. Next slide, please. Yeah, so if you were to actually calculate the Benford's law probabilities of starting with each digit, this is what it looks like. These are the probabilities. So just give that a second to sink in, but you can see how the numbers decrease as you go on. Yeah, okay. Next slide, please. So in 1972, a man named Hal Varian suggested using Benford's law to detect accounting fraud, because people generating fake numbers tend to distribute them uniformly. Next slide, please. So note, a distribution not matching Benford's law is not enough to prove fraud, but it can be enough evidence to start an investigation. And it has been accepted in a court of law. It's been taken as valid evidence to start off an investigation, if you can also back it up with other concrete statements about the actual crime that was committed. Okay, next slide, please. Okay, so that brings us to our fifth quiz question. Which of these distributions most closely follows Benford's law? Remember what the distribution looks like and what the equation was. If you want a calculator, you could bring it up. Yeah, so there are four distributions, A, B, C, and D. Think about which of these is most similar to Benford's law. And I'll just type the equation in the chat so people can see it. Mr. Bagheer has typed the equation in the chat. I'll give you maybe 15 more seconds. Five, four. So in the equation, the X represents the actual digit. Yeah. Okay, everyone has answered. Okay, so the answer is A. The correct answer is A. Actually, if you can stay on this slide, Mr. 
Bagheer, so I can show the pictures, yeah. So hopefully it's obvious to you guys that C and D are incorrect, right? Because the frequency is actually increasing as the digit increases. That's obviously not following Benford's law. So it's between A and B. Now, if you look at A, there seems to be a bit of noise, right? It's a bit jagged. You see a little bit of an increase from four to five, and again at six, seven, and eight. But overall, the numbers are still relatively small and approximately what they should be. Whereas if you look at B, the percentage of times that you're seeing one is about 50%, right? And two is about 25%. Three is like 12.5%, four, 6.25%. So you can see that this isn't actually quite Benford's law, right? This is halving each time, so it's a much more skewed distribution compared to Benford's law. So the answer is A. And probably if we had more data points, whatever data distribution A is drawn from would smooth out and resemble Benford's law more closely. Okay, next slide, please. Yeah, okay. So D, actually, that answer was the distribution of Nefarious Inc's expenses over the past several years. So there's clearly something fishy going on, right? We have a lot of numbers starting with nine and not so many numbers starting with one through eight. Okay, next slide, please. So investigating further, it seems that most of the expenses are $9,999.99, just below federal reporting limits. If you withdraw like $10,000 from the bank, you have to tell the IRS you're doing that. Okay, next slide, please. So when numbers seem to be almost designed to avoid certain regulations, it can be a clue that fraud is going on. And in this case, yes, we probably have enough evidence to suggest that there is fraud going on with Nefarious Inc. And someone at the company is fudging the expense reports. 
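For anyone who wants to check the quiz answer numerically, here is a minimal Python sketch. It assumes the equation typed in the chat was the standard Benford formula, P(d) = log10(1 + 1/d), which the transcript itself does not show. The halving percentages for distribution B are the ones read out above, and the function names are made up for illustration.

```python
import math

def benford_probabilities():
    """Expected share of values whose leading digit is d, for d = 1..9."""
    return {d: math.log10(1 + 1 / d) for d in range(1, 10)}

def leading_digit(value):
    """First nonzero digit of a nonzero number, e.g. 9999.99 -> 9."""
    digits = f"{abs(value):.15e}"  # scientific notation: 'd.ddd...e+xx'
    return int(digits[0])

def leading_digit_shares(values):
    """Observed share of each leading digit in a list of nonzero numbers."""
    counts = {d: 0 for d in range(1, 10)}
    for v in values:
        counts[leading_digit(v)] += 1
    return {d: counts[d] / len(values) for d in counts}

def total_gap(observed, expected):
    """Sum of absolute differences between two digit distributions."""
    return sum(abs(observed[d] - expected[d]) for d in range(1, 10))

expected = benford_probabilities()
for d, p in expected.items():
    print(d, f"{p:.1%}")

# Distribution B from the quiz: roughly 50%, 25%, 12.5%, ... (halving each time).
halving = {d: 0.5 ** d for d in range(1, 10)}
print("gap between halving and Benford:", round(total_gap(halving, expected), 3))
```

A large gap is only a reason to look closer, in the same spirit as the talk: it is not proof of fraud on its own.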
And as a result, that's causing stock price manipulation that Pete the Peacock is using to buy stocks at optimal times. Okay, next slide, please. Okay. I'm sorry, Priya, this is the third scene. Okay. Thank you. And also, for the people who are like, oh, nice tips to commit tax fraud, we are not teaching you how to commit tax fraud. So please don't do that. It's illegal. This is about catching fraud. So forensic accounting, that's a very interesting field. And some of the things I'm teaching you guys today, this is how actual crimes have been caught. Okay, next slide. Yeah, we're starting next. Clearly, Nefarious Inc's expenses are fake. But all of these reports have been signed by the CEO, Quimby the Quokka. The buck stops at the top. He must be the one behind this. I guess we can arrest both Pete and Quimby now. Let's head over to the address listed on that envelope addressed to Pete. It looks like Pete the Peacock is in the wind. He must have known something was up when he didn't receive an envelope as planned. But peacocks can barely fly. For Pete's sake, I'm saying he's on the run. Do we have any contact information, a phone number we can use to track him? Nothing on file. Nothing on file. But if there's one thing animals like Pete and Quimby are guilty of, it's being creatures of habit. If Quimby does not see any large purchases of Nefarious Inc's stock at 2 p.m., he will call Pete to make sure he makes the purchase. I'm sure Quimby is calling him right now. I have some friends who can take a look at his call logs and wiretap his phone. After making some calls, it seems that he made a call at 2:05 p.m. through a satellite phone. But we can't identify where the call went. He must be routing it through some hidden method. Of course, I saw an old decommissioned satellite on Nefarious Inc's expense report. Quimby must be using that to route his call to Pete. 
If that's the case, then we need to piggyback on the signal and send our own messages to figure out where the satellite sends a response. Piggyback? But he's a quokka. Displace the call. Someone's asking, what if you just use a burner phone? I think we're getting too in-depth into this world. Let's take it back a notch, but you can also track burner phones. Okay, so sending data on a noisy channel. In fields like electrical engineering and maybe subfields like wireless communication, a signal refers to any information that's being communicated, and a channel is the medium over which it is communicated. For example, if I send a cat meme to my sister over the internet, the signal is the bits that the cat meme I'm sending is comprised of, and the channel used is Wi-Fi and also the whole system of internet network cables between us. The problem with a channel like Wi-Fi is that there can be lots of interference. For example, other Wi-Fi signals, Bluetooth signals, even if I'm running my microwave. This interference is called noise. While Wi-Fi may be noisier than an Ethernet cable, no channel is noise-free. If I have an Ethernet cable that gets too hot, or maybe I'm putting a magnet close to my computer, you will see that some of the bits will start to flip and change, and that noise will disrupt the signal. Noise is bad for a signal because it can randomly change some of the data, maybe even silently, without causing any other errors in the system. Noise makes it difficult to know what the original signal was. The cat meme I sent could just become a mess of distorted pixels. Now, imagine a signal sent from a satellite in space. It travels through several miles of air to reach us, where other signals, solar flares, and weather can cause interference. So how can we preserve the integrity of our signal? That is, how do we ensure it can be received properly, or let the receiver know that the signal has not been received properly? 
Okay, I see some interesting ideas in the chat. Redundancy. Yes. How do we know if the picture was just disrupted by noise? Exactly. So that is the question, right? How do we know if the signal has actually been transmitted successfully, or did we just receive something that's garbage? Next slide, please. So here's an idea. We can send each digit multiple times. For example, if we want to send the number 7, we can send 777 instead, right? We just send that digit three times. That way, even if the signal is disrupted and some of its numbers are changed, if I receive a signal of 677, I know that since the majority of the numbers were 7, the sender likely sent a 7. This is called an N equals 3 repetition code, and it's a simple error detection and recovery encoding. For every three digits received, the receiver extracts the digit that appeared the majority of the time. Unfortunately, the original number cannot be recovered if two or three of the digits change. So going back to our previous example, if we had 777 and we received 677, that's a clear sign that there was an error, right? Because all three digits were not the same. But it also allows us to recover the original number and realize that, oh, hey, it's 7. Now it's not foolproof, right? If all the digits somehow got flipped to 5, right, and it's 555, there's always a chance of that happening, right? Then maybe we receive 5. We don't even know that there is an error, but the chances of that happening are hopefully very small. Okay, next slide, please. So the probability of sending a single digit through space and receiving it unchanged is 90%. If we use the N equals 3 encoding, that is, you know, we send each digit three times and the receiver extracts the most frequently sent digit, what is the probability that the receiver can successfully recover the sent digit? Yeah, and you can see the percentages, A, 72.9%, B, 97.2%, C, 92.7%, and D, 90%. Yes, so Craig is saying make it even more digits. 
That's definitely always an option, yeah, for encoding, yeah. Someone is asking whether part of the picture that we received in the letter was just disrupted by noise while being printed out. It wasn't printed out, and that's exactly why it wasn't sent on just a piece of paper. It was sent over a USB drive, so it wasn't disrupted. And someone else was suggesting a code that the receiver knows it should receive. So yeah, the receiver has to know the encoding that's being sent. And Hamming codes, yes, I love the suggestion about Hamming codes. Please write more if you want in the chat, yes, interested in seeing, you know, your thoughts or experience or what you know about Hamming codes. Yeah, or sending the message multiple times, yes. You don't have to buy private satellites for private phone calls. There are encrypted options, Signal, WhatsApp, you know, there are apps that are supposed to be encrypted. You can also use a satellite phone without buying a satellite. Isn't looking in people's mail without concrete proof they're committing a crime illegal? Well, the USB drive fell out and it looked suspicious, so that was enough proof. You don't need a preponderance of evidence because this is not a criminal thing, initially it was a civil issue, right, so you just need reasonable doubt. 20 more seconds, just waiting on like a few more people, maybe like five seconds, just kidding. I love how invested we are in a fake world. Yes, and people are asking about like, is there probable cause to search me now? They have privacy rights, they have rights to privacy, that's true. Look, if they didn't follow the law, maybe the quokka can sue them. Yeah, and here's the thing, committing fraud through the mail, that's called mail fraud, and Ms. Paws and Mr. Whiskers seem to already have a suspicion. It seems that Pete the Peacock was already under investigation. His stock returns seemed too high, so yeah. I'm waiting on one more person, so three, two, one. 
Next slide, please. OK, so if the sender sends a single number like one, then the N equals three encoding means that the sender will actually send one, one, one. Now, if the receiver receives one, one, one, they can successfully determine that one was sent and there's a 90% times 90% times 90% equals a 72.9% chance of this happening. OK, but our N equals three encoding can also tolerate one error, right? So we could also receive X, one, one, one, X, one or one, one, X, where X represents an error, which, you know, basically any digit besides one, and we can successfully recover that the digit sent was one. So we can still successfully recover it in this case. Next slide, please. And so the probability that X, one, one is what was received basically is 100 minus 90%, right? That 10% is the probability that we have an error times 90%, the probability we received the one correctly times 90%, again, the probability we received the one correctly. So that's 8.1%. And similarly, the probability that we, you know, one X, one was what was received is 90% times 10% times 90%, that's 8.1% as well. And similarly, the probability that one, one X is sent is 90% times 90% times 10%, so again, 8.1%. So next slide, please. So that means the probability of receiving any of those three combinations was 24.3%, or you can also recognize the symmetry in the problem and just simply calculate three times 10% times 90% times 90% equals to 24.3%. Let me know if there's any questions about that. But if we just add this to the probability of receiving one, one, one, which was 72.9%, that gives us a total probability of 97.2%. So that's the probability that we can recover the actual value that was sent. Yeah. 
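The arithmetic above can be double-checked with a short Python sketch of the N equals 3 repetition code. The function names are illustrative assumptions, not something shown in the talk, and the probability function only counts the cases the talk counts: at least two of the three copies arriving unchanged.

```python
from collections import Counter
from math import comb

def encode(digits, n=3):
    """Repeat every digit n times: '7' -> '777'."""
    return "".join(d * n for d in digits)

def decode(received, n=3):
    """Majority vote over each group of n received digits.

    If no digit has a majority (for example '123'), this returns the
    first digit of the group, which is only a guess.
    """
    groups = [received[i:i + n] for i in range(0, len(received), n)]
    return "".join(Counter(g).most_common(1)[0][0] for g in groups)

def recovery_probability(p_correct, n=3):
    """Chance that more than half of the n copies arrive unchanged."""
    return sum(
        comb(n, k) * p_correct**k * (1 - p_correct) ** (n - k)
        for k in range(n // 2 + 1, n + 1)
    )

print(encode("7"))                         # 777
print(decode("677"))                       # 7
print(f"{recovery_probability(0.9):.1%}")  # 97.2%
```

Calling `recovery_probability(0.9, n=5)` shows what the chat suggestion of sending even more copies would buy, at the cost of sending five times the data.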
So error detection and encoding is a very rich field of study, and the goal is to send the minimal extra information needed beyond the original signal while still being able to retrieve the original signal, because if you send too many additional, you know, too much additional information, those are called error checking bits, that also increases the probability of an error in those bits. So then your error checking bits then may need their own error checking bits and then it can go on forever. So you have to figure out using probabilities, you know, what's the probability there'll be noise that will disrupt something, maybe flip a bit or something like that, and calculate like, OK, what's the minimum number of bits I can send and what kind of clever encoding schemes can I use to try to send the fewest extra bits possible while still being able to, like, either receive the original message correctly or detect that there is an error and possibly recover what the original message was supposed to be. Yeah. OK. Next slide, please. Question in the chat quickly. Why would you do 100 minus 90%? Yeah. So the probability that we actually get the correct digit sent, right, is 90%. So that's why originally we were doing 90% times 90% times 90%, the probability that we received all three digits correctly. But it is also possible that, you know, we receive one digit that's incorrect and two digits that are correct. So what's the probability that we receive one digit that's incorrect? What's the probability that we didn't receive one digit correctly? Right. So since the probability that we receive a digit correctly is 90%, that means, you know, if you subtract it from our total of 100%, right, or one, that means the probability that we received a digit incorrectly is 10%. Does that make sense? Hopefully that makes sense. OK, so, yes. OK, great. So 10% of the probability that we receive a digit incorrectly, 90% still the probability that we receive a digit correctly. 
So we receive two digits that are correct, one digit that's incorrect. We can still recover the original digit sent. And there's three ways that one digit can be incorrect and two digits can be correct. So that's why we do three times 10% times 90% times 90% and add it to our original probability of receiving all three bits correctly in the first place. And that gives us 97.2%, which is still kind of a high rate of error. Right. Imagine if like, you know, one out of every like 33 things you send has an error in it. But yes, existing codes, actual codes out there are much better. And also like your error rate is probably going to be much less. So how would the receiver know that the two correct ones aren't just errors that have been flipped? Exactly. You don't know. Right. It's possible that two of the digits have been flipped in exactly the same direction. And there's no way to know that. But that's part of that 2.8% chance that you're not able to successfully recover the correct digit that was sent. And so that's why you need, you know, more clever encoding schemes. So next slide, please. So repetition is actually not the most efficient encoding strategy, because, for example, in what we did here with n equals three repetition, we send triple the information, right? That's a lot of data to send and receive. But most encoding strategies will make it possible to detect errors in multiple pieces of information. So, for example, Hamming codes, that's a very popular encoding mechanism, especially for sending like satellite signals. And someone mentioned this in the chat. You might have like a five comma three encoding scheme, which means that for every five bits of information you send, you have three bits of error detection. And so if you think about it, there's far fewer bits of error detection than there are that were being sent, because basically you combine some of the bits that are being sent and the error detection bit. 
And like each error detection bit basically covers like two or three of the original bits that are being sent. Hopefully that makes sense. OK, next slide, please. Yeah, OK. For example, like a checksum is another strategy that can be used. It can be sent at the end of the message and it sums up all the digits sent. And so this can help us determine if any of the digits received were incorrect. But it is possible, right, if like one digit went down by one, one digit went up by one, you know, your checksum would still end up being correct. So it's not foolproof, but there are very clever mechanisms such as Hamming codes for error detection. Yeah. OK, let me just go through the chat. The cables, making cables out of lead to prevent external radiation, there still could be issues with the, you know, material themselves. Right. Nothing is foolproof. So there could still be noise with lead cables. Yeah, OK. Someone typed up really interesting information about Hamming codes in the chat. I'll just copy and paste that here. Thank you. Great. Someone saying you could send twice. But the thing is, like, you could also have errors twice. Right. So, yeah, Hamming codes are reserving a few bits from each packet to make a specific subset in the bits and the packet as a whole have an even number of ones. If one bit is flipped, it can be identified and corrected by looking at which subsets have an odd number of ones. If multiple bits are flipped in a single packet, they cannot be recovered. But there is a very high probability of detecting that there was an error. And then you can always ask for the packet to be resent. So the fraction of storage that's spent on confirmation decreases, the packet size increases, but it becomes more likely there will be multiple flip bits in a single packet and makes it impossible to recover. OK, next slide, please. So now we're listening to the satellite transmission for a phone number. 
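Going back to the checksum idea for a moment, the blind spot described above (one digit goes up by one while another goes down by one) can be shown with a small Python sketch. The helper names, the sample digits, and the choice to reduce the digit sum modulo 10 so the checksum fits in one digit are all assumptions for illustration; the talk only says the checksum sums up the digits sent.

```python
def digit_checksum(digits):
    """Sum of the digits, reduced modulo 10 (an assumed convention)."""
    return sum(int(d) for d in digits) % 10

def add_checksum(digits):
    """Append the checksum digit to the end of the message."""
    return digits + str(digit_checksum(digits))

def looks_valid(message):
    """Recompute the checksum of the payload and compare it to the last digit."""
    payload, check = message[:-1], message[-1]
    return digit_checksum(payload) == int(check)

sent = add_checksum("2718")
one_error = "2728" + sent[-1]   # a 1 became a 2: the sum changes, so it is caught
two_errors = "3618" + sent[-1]  # 2 -> 3 and 7 -> 6: the changes cancel out

print(looks_valid(sent), looks_valid(one_error), looks_valid(two_errors))
```

The last check passes even though two digits are wrong, which is why the talk calls a plain checksum "not foolproof".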
And here's what the transmission picked up for the phone number. I grouped together every three digits that we received. So our first group is nine eight nine. The next groups are one one zero, three three four, and five five five. And you can read the rest of the numbers for yourself. Now, if we are saying that we are using n equals three repetition for our encoding scheme, what phone number can you extract from these digits? And can you try typing it in the chat? Okay, I think a lot of people are getting it. Next slide, please. Yeah, so the original number was 913-555-0226. So that's enough to figure out what Pete the Peacock's number is and we can call him and try to trace the call. Next slide, please. Mr. Bagheer, did you send me the link? Oh, sorry, let me send you the link. Okay, I just sent you the link. Looks like the number is 702-555-9864. If I call this number, can you triangulate the location? Just keep him on the line for 30 seconds and he's as good as ours. And that's 30 seconds. We have his location. We'll send the SEC to pick up both Quimby and Pete. That's one bird that's not going to be able to fly the coop. Looks like we got them, guys. Thank you for all your help. Okay, so stick around. I'm going to be launching polls for name, email, and grade. So make sure you stay until the end. We're done with all the quiz questions, but this is so that we know who the winners are. The way Zoom is set up, there's no really easy way to calculate who the winner is right off the bat, so it will happen after. And Math Kangaroo will email you the prizes and stuff. While you're putting in your name, I just wanted to introduce our cast of characters. So back when we were participating in Math Kangaroo, Dr. Roo was the prize that we got. Maybe you guys got stuffed kangaroos as well. And you can see the T-shirt, 2008. I don't know. Maybe some of you were born. Maybe not. But, yeah. Oops, sorry. Let me launch this poll again. 
I think I closed it before. I thought everyone had answered. Wait, let me try this again. Nope, nope, nope, nope. Relaunch poll. Can everyone again submit your name? Sorry. I'll just keep it open. Even if you submitted your name, can you please just submit it again? Sorry. And just your name. Don't put your email or grade. Some people are asking, if you got a question wrong, will you still get the prize? Not sure. Depends on how well other people do. So it's possible, and it might not be. If you want to learn other cool topics, we do have a YouTube channel, Mr. Bagheer, if you can send out the link to everyone. Thank you. So you can see a lot more of Ms. Paws and Mr. Whiskers, if you want, and learn more stuff. Math Kangaroo will be sending a coupon for our AI and Visual Arts Camp. It normally costs $50, but you will get an additional coupon, and this is true for everyone with us today. So I'll just play the video quickly. Let me just... Charlie, what are you watching? Ms. Paws, I'm improving upon Dogs Playing Poker. Charlie, what are you up to? Can you see the screen? Ms. Paws, I'm improving upon Dogs Playing Poker and creating Cats Playing Poker. Hmm, I guess I can see it if I turn my head a bit. You know, if you sought the help of AI, Charlie, maybe your paintings might turn out a little better. If you take MethaPlus's AI and Visual Arts Camp, you'll learn about the different AI models used to create, restore, and authenticate different types of visual art, including paintings, sculptures, and film. You'll be creating your own AI art gallery, as well as participating in MethaPlus's AI plus Visual Arts startup pitch. So what do you think, Charlie? Charlie? Charlie! He's daydreaming again, isn't he? So that's... That's our AI and Visual Arts. I'm waiting for four more people to answer the name question. I saw more. So please, please do fill that out. I'm going to... Let me share screen again. Additionally... And so this one is for grades 6 to 12. 
If you are grade 7 to 12, you also get a coupon to the Intro to Ethical Hacking Camp. You learn how to hack and how to defend yourself. Mr. Bagheer, do please send the link and I will share the video for that. You have been hacked. You have 24 hours to send a million dollars of bitcoin before all your data gets destroyed forever. Ms. Paws, I'm scared. I had the most horrible nightmare that I was hacked. Charlie, what did you expect? I've told you a million times not to watch scary movies right before bedtime. But you know, Charlie, you wouldn't be so scared if you knew how hackers hack and how to defend yourself. If you take MethaPlus's self-paced Introduction to Ethical Hacking Camp, you'll have access to 14 hands-on labs which will teach you how to steal passwords, launch Trojan horse attacks, hijack browsers, and more. For your final project, you will evaluate your own system security and write a penetration test report so that you can defend your devices. So, Charlie, what do you think? Do you want to be an ethical hacker? Charlie? I guess he's already fast asleep. And this class will also teach you why just plugging in random USB drives that you find in envelopes is not a good idea. I like what Ms. Paws said. I'm waiting for one last person in the name question. If I don't get a response in the next five seconds, I'm just closing the poll. Five, four, three, two, one, zero. So, we actually answer that. If you take the course, we answer this like, you know, is this a good idea or bad idea to teach hacking? This is definitely a conversation that comes up. But the thing is the bad guys already know all of this stuff, so you're not really teaching anything new to them. They know a lot more. But at least you have an idea of how to protect yourself. So, I'm closing the name question. I'm going to reshare the screen here. What does stealing a password mean? It means someone gets hold of your password, so they are able to get access and then start logging into your stuff. 
Well, so we also answer that in the course. It is illegal to hack, but with ethical hacking, if you are hired by the company and pre-approved to do that, you get paid to do it. So, now I'm going to ask you what your grade is just so that we have that information. Ah. Oops. Yeah. Oops, let me, sorry, sorry. Let me try that again. Okay, make sure to answer what grade you are in. Another thing, if you are interested in learning about AI, you know, that's a hot topic, AI and machine learning research. If you're interested in participating in science fairs and completing a research project for that, this one is for people who are currently in grades eight to 12. It's six weeks in the summer. It's virtual and it's live. And you get to do a research paper in partnership with a university. So, you get a professor who's a mentor and you will write your own research paper, create your own research poster, and then present it in front of professors and your parents. And alumni of this camp have gone on to submit to science fairs and conferences. They've presented, you know, as far away as Tokyo in Japan. And they have won state-level science fairs. So, currently, yeah, all of our courses are a mixture of CS. Hopefully at some point we will incorporate more. But in terms of mentorship, if you need one-on-one tutoring or something, then definitely we offer tutors for non-CS subjects. This one is, yeah, eight to 12. This is not for grade six. AI and visual arts is the grade six one. I'm going to end the poll here for the grade question. Then if you are interested in competitive programming, Mr. Bagheer, are you sending these links? Okay, yeah, cool. This is a two-week camp. It's virtual. It's live during the summer. And if you are interested in participating in like USACO or like Computing Olympiads, then this is a great camp for you. If you love math, hello, then this is also a great camp because it teaches data structures and algorithms and how all the math works behind CS. 
This is perhaps one of the most important courses that you will take the equivalent of this at the university level because the stuff that you learn in this camp, that's what coding interviews ask about. And coding interviews are basically when you want to have an internship or a job in computer science. Besides asking you, you know, like what projects have you worked on? They ask you difficult coding questions and you're supposed to write on the board and think through it live and explain to your interviewer, you know, how to code up something and how it is efficient and all those kinds of things. And a lot of universities don't do a great job teaching this. And so this is a great camp for that. And, you know, I can't emphasize that if you go into computer science, right, you're looking at perhaps a six figure salary. So you really want to ace that coding interview. And so this is a great camp for that. And lastly, the most important one here that I want to talk about is in fall of 2024, you know, maybe did any of you did this last year? We partnered up with Math Kangaroo to offer an AI course. So that was sort of a preview. This year, we're going to do the same thing, but it's going to become even bigger. It's going to be longer, the course. And so there will be an AI Math Kangaroo course as well a competition that is followed by it. And so it's going to be the premier AI competition for middle school and high school students. It's something that you can put in your college application that you did this course. And then if you do really well in the competition. So this is super, super exciting. So last year it was free because it was a shorter course. But this year it's it will be paid. I believe it will be $99. So and it will be eight classes. So, yeah, definitely recommend joining this so you can see us in the fall. And a couple of. Oh, there isn't a link yet, but the Math Kangaroo CEO will send it out soon. Probably like end of summer, beginning of fall. 
So, yeah, make sure to keep your schedules clear for it. We're not really sure yet what the exact dates are, but all of that will go out soon. And then, yeah, we had some people do this last time I got in the chat. Yeah, definitely. We hope to see you. You guys are a great audience. Hopefully we will see you soon in one of our classes. And yeah, make sure to put in your emails. I will close it in 10 seconds. And I think the CEO will get in touch with you in maybe two-ish weeks with the prizes and stuff of Math Kangaroo. Yeah, who won the prizes? You'll get an email if you did win the prize. I'm waiting for one more person, but if you have finished filling out the poll, feel free to head out. Have a great Memorial three-day weekend. And hopefully we'll see you guys soon. Yeah, thank you all for coming.
Video Summary
In this video presentation, two speakers from MethaPlus, Mr. Bagheer and Ms. Haripriya, both with strong educational backgrounds in electrical engineering and computer science, discuss several intriguing topics. They began by congratulating Math Kangaroo competition participants and shared their own accolades in the same competition, establishing their math competence.
The discussion included digital color encoding using RGB values and how computers store these using binary systems. They delved into topics like color representation in binary, how color perception works in humans, and the limits of color differentiation.
The session introduced steganography, hiding messages within digital images, and demonstrated how slight pixel changes could conceal another image, showcasing an example of insider trading.
Next, they explored Benford's Law for fraud detection, evidencing how certain numerals appear disproportionately in natural data sets and how deviations could suggest fraudulent activity.
Further, the video showcased encoding and error detection strategies for reliable communication over noisy channels using methods like repetition coding and more efficient alternatives like Hamming codes, concluding with the use of these techniques to solve a fictional mystery involving insider trading.
The presentation included educational opportunities through MethaPlus, offering camps and courses on AI, visual arts, competitive programming, and ethical hacking, aimed at enhancing skills in relevant tech fields. They also hinted at an upcoming AI course and competition organized with Math Kangaroo, ideal for middle and high school students.
Keywords
MethaPlus
RGB encoding
binary systems
steganography
Benford's Law
fraud detection
Hamming codes
insider trading
educational camps
AI competition