I've been reading and coding my way through the book "Black Hat Python" by Justin Seitz and really enjoying it. I'm only about halfway through it s far, but I've enjoyed it so much I wanted to share my experience so far.
Checkpoint - Where am I now?
The "Capstone Project" of Chapters 1-4 combines a lot of things in the text that leads up to it. It's not billed as such, but this is how I see it. The book starts with basics of networking and you practice working with sockets through Python and doing a lot of things very manually in the first couple of chapters. This is followed by a whirlwind introduction to Scapy. Scapy is just terrific and I think this approach was a clever way for the author to demonstrate that. This post is less of a summary of this stuff and will focus just on what I am calling the "Capstone Project".
Capstone
The basic idea is this two step attack which is very 1984 or otherwise potentially nefarious but also a pretty interesting challenge.
- ARP cache poison attack to insert ourselves as MITM and capture victim traffic to a pcap
- Parse this capture offline, and extract all the images out of the HTTP traffic present in the data, and then use opencv to detect faces within the images, and if found, render an edited image with the detected face highlighted
Attack Details
There's a lot going on here, and a lot of details between the start and end of this attack 'kill chain', so let's look in a little more detail.
ARP Poisoning Attack1. first step of the arp cache poison attack is to get the current state of the later 2 network so we can restore things when the attack is over or cleanup on exit
2. next, we abuse the ARP protocol, and send gratuitous ARPs to the gateway and victim in order to spoof our mac address in each local table. We want the gateway to believe we are the victim, and the victim to believe we are the gateway.
3. now that we are in the middle we must be configured to forward ip traffic
4. next we sniff traffic to capture the victim traffic we are passing
5. finally, when we are happy with the traffic collected, we stop sniffing, and restore the layer 2 network by sending simple 'who-has' broadcasts to make the real hosts respond with their correct MAC addresses. There's also an arg to stop capturing after --count
packets.
Pic Carver1. first step of this is parsing the pcap with scapy, which is easy enough 2. next, we dissect the pcap into tcp sessions so we can follow tcp streams 3. then we walk the packets in the session and grab just the http traffic 4. now we parse out the headers and use that to infer where there is image data 5. since we found image data, we reassemble it from the tcp stream and http response (or request) using scapy 6. next we write the image data to disk as a standard image 7. now we start the second phase of the program and check the image we just wrote to see if it contains a face via opencv 8. if we detected a face in the picture, we use opencv again, this time to draw a rectangle on the image and indicate the face we detected 9. finally we write the image to disk as an edited 'face' image 10. once the whole pcap and all pictures are processed, we print out a summary and exit
Pics or it didn't happen
The say a picture is worth a thousand words so here we go.
bash
writing facial recognition edited image to: ./faces/bigface.pcap-pic_carver_face_50.jpeg
writing original image to: ./pictures/bigface.pcap-pic_carver_50.jpeg
Isn't this book old?
Yeah. Copyright is 2015, and here we are 5 years later. Python has changed a bit, and so has scapy. That happens.
I think I got a little extra from the book and exercises because it is a little older. For simpler things, all the examples worked with little or no modification in Python 3 (they were originally written for 2). This made it easy to jam in the code from the examples into an IDE and see things work, and start building a foundation without much friction. When I got to the more ambitious examples that strayed from stdlib stuff, things started to break. This meant that I ended up spending more time reading the docs and debugging things, and I think that experience really helps cement a concept.I ended up rewriting the core parsing of the pcap with scapy for the capstone project and that was a great, practical introduction to scapy. I was able to delete a function or two and reduce the code complexity a bit, thanks to the addition of the http layer into scapy; this was not available when the book was published. Now that it is a part of scapy though, it didn't make sense to manually unwind packets and use byte ranges and slices to carve out headers when you can instead use an instance of the HTTP layer and ask for parts of the packet via attributes. I also re-wrote some of the simpler examples using asyncio, and replaced some of the synchronous socket based stuff using the higher level streams approach to networking with asyncio. All of this made the experience of reading this book even more rewarding for me, because I ended up learning a lot that was not in the book!
Want to try it yourself?
Take a peek at the README in for this stuff. Grab the code and start playing around with your network! Take the general structure of this and change it to do different things; you can easily incorporate some different processing of the photos or any other data you might want to extract. One tip I recommend it to try simple things first. I made sure to find a website that was http only to make the parsing a little easier. Surprisingly, http://www.ucla.edu/ is all http only so it was easy to just click around their site and gather some traffic I knew would include pictures.
Thank you
I'd like to thank the author of the book for such a great read and interesting project. I'm really enjoying it and looking forward to the remaining chapters. I don't think I would have tried something like this if it weren't for this book.
I'd also like to thank you, the reader. If you have stuck with me all this way, congratulations! :) Please reach out to me on Twitter if you have any suggestions for stuff you'd like to see or feedback for me! :D