All the fundamentals - How the Internet works, networks, file sharing protocols, and more on TryHackMe

Over the past week, I’ve been hacking away on tryhackme.com. TryHackMe, which now has over half a million users, was founded by 2 cybersecurity students, Ashu and Ben, during a summer internship.

“(We) found learning security to be a fragmented, inaccessible and difficult experience”

They had a brilliant idea - Ben created a way to deploy machines and Ashu suggested uploading all their notes to a centralized platform that others could access. That is essentially TryHackMe, with the added benefit of an extremely intuitive user interface.

TryHackMe has rooms that focus on specific topics, sections that group rooms into a broader topic (e.g. Linux fundamentals), and learning paths that group sections. Each room contains many short-response questions; answer all of them correctly and you complete the room. To answer some of the questions, you need to hack a target machine on the TryHackMe servers. You can do this from the Attack Machine that TryHackMe offers online, or from your local machine over the Internet by first connecting to TryHackMe’s server through OpenVPN.

I’m currently on the Complete Beginner learning path, which gives a broad introduction to topics in cybersecurity. Once I complete this path, I can download a certificate, which would feel pretty good, as the path is quite a challenge for those new to the technicalities of hacking (like me).

Linux Fundamentals

The best tutorial on Linux I’ve ever found is TryHackMe’s 3-room section on Linux fundamentals.

Room 1 deals with basic Linux commands, finding files and content within files, and shell operators. The very basic Linux commands are listed in the room itself:

https://tryhackme.com/room/linuxfundamentalspart1

For finding content within files, grep is a very useful command: it prints the lines of a file that match a regular expression. This can be useful, for example, for finding an unencrypted password in a plain .txt file directly from the terminal, without opening the file and searching through it.

The cat command outputs the contents of a file. In a similar vein to grep, cat makes it convenient to display small files, which may contain authentication details, in the terminal window.
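To make the idea of matching lines against a regular expression concrete, here is a minimal Python sketch of what grep does. It is only an illustration, not how grep is implemented; the file name, its contents, and the pattern are all made up for this example.

```python
import re
import tempfile
from pathlib import Path


def grep(pattern: str, path: Path) -> list[tuple[int, str]]:
    """Return (line_number, line) pairs for every line matching the regex."""
    regex = re.compile(pattern)
    matches = []
    with path.open(encoding="utf-8", errors="replace") as handle:
        for number, line in enumerate(handle, start=1):
            if regex.search(line):
                matches.append((number, line.rstrip("\n")))
    return matches


# Hypothetical file contents, invented for this illustration.
sample = "host=10.0.0.5\nuser=admin\npassword=hunter2\nport=22\n"

with tempfile.TemporaryDirectory() as folder:
    notes = Path(folder) / "notes.txt"
    notes.write_text(sample, encoding="utf-8")
    hits = grep(r"pass(word)?\s*=", notes)

for number, line in hits:
    print(f"{number}:{line}")
```

Only the third line of the made-up file contains the pattern, so that is the only line reported. On the terminal, the rough equivalent would be running grep with the -n flag against the same file.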

Room 2 deals with remote access through SSH, essential Linux directories, and file permissions.

The /etc folder holds system configuration files, including account information: usernames are listed in /etc/passwd and passwords are stored in hashed form in /etc/shadow. The /var/log folder contains log files from running applications, which help an administrator keep track of issues or spot signatures of unusual activity. The /root folder is the home directory of the superuser. The /tmp folder holds temporary files. Because any user can write to it, it’s also a good place for hackers to store the scripts they use to find vulnerabilities once they are inside a compromised system.
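Room 2 also covers file permissions, which I only mention in passing above. As a rough illustration, the sketch below uses Python's stat module to turn numeric permission modes into the familiar string that ls -l prints. The file names and modes are hypothetical and chosen only to show the pattern.

```python
import stat

# Hypothetical entries and modes, chosen only for illustration.
examples = {
    "notes.txt": stat.S_IFREG | 0o644,  # regular file
    "scan.sh": stat.S_IFREG | 0o755,    # executable script
    "private": stat.S_IFDIR | 0o700,    # directory
}

# stat.filemode renders a mode the way `ls -l` displays it.
rendered = {name: stat.filemode(mode) for name, mode in examples.items()}

for name, text in rendered.items():
    print(f"{text}  {name}")
```

The first character is the file type (- for a regular file, d for a directory). The next nine characters are three read/write/execute triplets, for the owner, the group, and everyone else, in that order.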

Room 3 talks about terminal text editors like nano and vim, package management through installing and maintaining software, automating processes, and more miscellaneous tools.

Networks and Exploitation

TryHackMe does a really good job of explaining how computers connect to the Internet, the differences between various services that connect to the Internet, and how hackers exploit various vulnerabilities.

Room 1 discusses the high-level OSI and TCP/IP models for how network connections occur, and some basic networking tools.

OSI Model | https://en.wikipedia.org/wiki/OSI_model#cite_note-20

The OSI model is a conceptual model that standardizes Internet communication using standard protocols.

Layer 7 is where the user interacts with the computer for inputting data and visualizing the outputs.

Layer 6 transforms the data from the application layer into a format that is understood by the receiving computer (e.g. encryption, compression, or other transforms to the data).

Layer 5, the session layer, sets up and maintains connections with the other computer across a network. This is what allows sending multiple requests to different endpoints all at once without getting mixed up (as in handling multiple browser tabs).

Layer 4, the transport layer, chooses the protocol over which the data is transferred. The 2 most common transport protocols are TCP and UDP. TCP is connection based, meaning that a connection must be established between the computers and maintained for the duration of the request. This gives reliable transmission, ensuring that all the data sent is received. UDP, on the other hand, essentially throws packets of data at the receiving computer. This is useful where speed matters more than accuracy, such as video streaming.
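The difference between the two is easy to see in code. The sketch below is a small loopback experiment, assuming nothing beyond Python's standard socket module: the UDP sender just fires a datagram at an address, while the TCP client has to connect and be accepted before any data moves. The function names are my own, made up for this example.

```python
import socket

LOOPBACK = "127.0.0.1"


def udp_round_trip(message: bytes) -> bytes:
    """Send one datagram to a local receiver; no connection is set up."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as receiver:
        receiver.bind((LOOPBACK, 0))  # port 0: let the OS pick a free port
        receiver.settimeout(2.0)
        with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sender:
            sender.sendto(message, receiver.getsockname())
        data, _ = receiver.recvfrom(1024)
        return data


def tcp_round_trip(message: bytes) -> bytes:
    """Send the same bytes over TCP; a connection must exist first."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as listener:
        listener.bind((LOOPBACK, 0))
        listener.listen(1)
        listener.settimeout(2.0)
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as client:
            client.settimeout(2.0)
            client.connect(listener.getsockname())  # handshake happens here
            connection, _ = listener.accept()
            with connection:
                connection.settimeout(2.0)
                client.sendall(message)
                return connection.recv(1024)


print(udp_round_trip(b"ping over UDP"))
print(tcp_round_trip(b"ping over TCP"))
```

On loopback both calls succeed, so the demo does not show packet loss. What it does show is the extra connect and accept step that TCP needs and UDP skips.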

Layer 3 is the network layer, responsible for getting your request to its destination. When you make a request to, say, google.com, the name is first resolved to an IP address, and the network layer uses that address to decide which route to take through the Internet.

Layer 2 is the data link layer, which adds the physical (MAC) address of the receiving device and packages the data in a format suitable for transmission.

Finally, layer 1 is the physical layer, the computer hardware itself. It converts data to and from the electrical signals that are sent and received over the network.

Room 2 is an in-depth look at Nmap, one of the most powerful network scanning tools.

In going through Nmap, this room does an excellent job of reviewing how network connections are made between your computer and another server: through ports. When a machine runs several applications, it needs a way to direct traffic to the appropriate service, and ports are the solution. A network connection is made between 2 ports: an open port listening on the server and a randomly selected port on your computer. For example, when you connect to a website, port 443 is usually used on the server side to send data over HTTPS (port 80 for HTTP), whereas on your computer a random port is selected to receive data. When multiple browser windows are open, multiple ports are open simultaneously on your computer, keeping the received data separate.

Linux command sockstat illustrating local and remote port numbers | Skanda Vivek
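You can reproduce the local and remote port picture with a few lines of Python. This is a minimal sketch on the loopback interface: a stand-in server listens on one port, and two client connections (think of two browser tabs) each get their own local port from the operating system. The variable names are made up for this example.

```python
import socket

LOOPBACK = "127.0.0.1"

with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as server:
    server.bind((LOOPBACK, 0))  # port 0: let the OS pick a free port
    server.listen(2)
    server_port = server.getsockname()[1]

    # Two independent connections to the same listening port.
    tab_one = socket.create_connection((LOOPBACK, server_port), timeout=2.0)
    tab_two = socket.create_connection((LOOPBACK, server_port), timeout=2.0)
    with tab_one, tab_two:
        local_ports = [tab_one.getsockname()[1], tab_two.getsockname()[1]]
        remote_ports = [tab_one.getpeername()[1], tab_two.getpeername()[1]]

print("server port: ", server_port)
print("local ports: ", local_ports)
print("remote ports:", remote_ports)
```

Both connections report the same remote port, the one the server listens on, while the two local ports differ. The exact numbers change from run to run because the operating system picks them.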

Nmap is used to discover open ports and the services running on them, and to gather further reconnaissance information. This is often the first step in the hacking process. Read my other article for more on reconnaissance in cyber attacks:

https://medium.com/emergent-phenomena/what-ingredients-make-a-successful-cyber-attack-part-1-reconnaissance-379f7353014c

There are 3 basic scan types in Nmap: TCP, UDP, and SYN. I’ve talked about the TCP and UDP protocols above and the contexts in which they are used to send information over the Internet. A SYN scan is a stealthier version of the TCP scan in which the scanner never completes the connection handshake. When the target replies to the initial SYN packet, the scanner sends a reset instead of the final acknowledgement, which lets it bypass older intrusion detection systems that only watch for fully established connections.
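The plain TCP scan is simple enough to sketch in Python. The snippet below is not how Nmap is implemented; it is a minimal illustration of the idea, in which a port counts as open if a full TCP connection to it succeeds. The function name connect_scan is made up, and the demo only probes two ports on the local machine that it sets up itself. A SYN scan is not shown, since it needs hand-crafted packets rather than ordinary sockets.

```python
import socket


def connect_scan(host: str, ports: list[int], timeout: float = 0.5) -> dict[int, bool]:
    """Try a full TCP connection to each port; True means it was accepted."""
    results = {}
    for port in ports:
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
            probe.settimeout(timeout)
            # connect_ex returns 0 on success and an error code otherwise.
            results[port] = probe.connect_ex((host, port)) == 0
    return results


with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as listener, \
        socket.socket(socket.AF_INET, socket.SOCK_STREAM) as placeholder:
    listener.bind(("127.0.0.1", 0))
    listener.listen(5)                  # this port is open
    placeholder.bind(("127.0.0.1", 0))  # bound but not listening: closed
    open_port = listener.getsockname()[1]
    closed_port = placeholder.getsockname()[1]
    results = connect_scan("127.0.0.1", [open_port, closed_port])

for port, is_open in results.items():
    print(port, "open" if is_open else "closed")
```

Because every probe completes the handshake, this kind of scan is the easiest to log on the target side, which is the motivation for the SYN variant.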

Now we know the protocols to connect computer networks over the Internet, as well as how to detect these protocols using Nmap. Rooms 3 and 4 in the network exploitation series are focused on exploiting these networks to gain credentials of users for important services like remote SSH logins.

Room 3 focuses on 3 specific protocols for file sharing and remote access: SMB, Telnet, and FTP. Every exploit has 2 parts: first, enumerating the services on the target network with tools like Nmap to gather publicly available information, and second, using this information to figure out credentials through clever insights into common misconfigurations and through password attacks.

SMB stands for Server Message Block, a protocol used for sharing access to files, printers, and other resources on a network. This part focuses on a common SMB misconfiguration: anonymous sharing with no password required.

Telnet is an application protocol used to connect to a machine and execute commands remotely. It sends all messages in clear text, and has been replaced by SSH in most cases. This part introduces us to reverse shells, which are quite brilliant.

Shells are basically terminals for communicating with a machine. On your local Linux machine, you can start a shell anytime and execute a command such as reading a file in a directory. However, remote shell access is different. Remember, all the information you obtain from a remote shell travels over the Internet, specifically over ports. While SSHing, you are not literally mirroring a remote terminal; you are receiving information that the server sends back.

If you log in through SSH (short for secure shell) with the appropriate credentials, you can execute commands as if you were running them locally. So in this case, you wouldn’t have to worry about the shell.

However, remote shells are not allowed for all ports or connection types. You might get access to a remote server but be unable to send and receive information through a shell. In a regular remote shell, the user initiates communication with the server, and the server listens. A reverse shell is the opposite: the server opens a session to the user over a certain port, and the user listens on that port. The attacker can blend this traffic with HTTP traffic by having their local computer listen on port 80 and sending the data to that port.
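The only thing that changes in the reverse case is who listens and who connects. The sketch below shows just that direction of the connection on the loopback interface. It is deliberately not a shell: the "server" side only sends a fixed greeting, and all names here are made up for illustration.

```python
import socket

LOOPBACK = "127.0.0.1"
GREETING = b"hello from the server side\n"

# Step 1: the user's machine opens a port and waits.
with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as user_listener:
    user_listener.bind((LOOPBACK, 0))  # port 0: let the OS pick a free port
    user_listener.listen(1)
    user_listener.settimeout(2.0)
    listen_port = user_listener.getsockname()[1]

    # Step 2: the server side initiates the connection outwards.
    with socket.create_connection((LOOPBACK, listen_port), timeout=2.0) as server_side:
        session, _ = user_listener.accept()
        with session:
            session.settimeout(2.0)
            server_side.sendall(GREETING)
            received = session.recv(1024)

print(received.decode().strip())
```

In a regular remote shell the roles in steps 1 and 2 would be swapped: the server would be the one calling listen and accept.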

The FTP protocol is used for remote transfer of files over a network. This part teaches how to perform password dictionary attacks to figure out FTP credentials using Hydra, a tool for such attacks against various protocols.

Room 4 discusses more network exploits against the Network File System (NFS), the Simple Mail Transfer Protocol (SMTP), and MySQL. This room introduces Metasploit, an excellent penetration testing tool for carrying out enumeration and attacks on these various protocols.

What’s Kali Linux?

Kali Linux is a Linux flavor designed for hackers. It comes with most hacking programs pre-installed. Nmap, Metasploit, Hydra, and much, much more are literally a single command away.

While TryHackMe provides an Attack Machine, it is extremely slow, especially if you are running nmap -p- [IP], which scans all 65,535 possible ports. As an alternative, you can connect to their network through OpenVPN. I ended up installing Kali Linux on a VMware virtual machine and connecting to TryHackMe’s machines through OpenVPN. This is working great so far, and I even have a local version of Kali Linux to play around on.

Final thoughts

Often, hacking is viewed as taboo, on the borderline of what’s legal and what’s not. For example, if I naively used the methods that I learnt from TryHackMe on a public IP address, I could be in big trouble. But this is a chicken-and-egg scenario: how does one learn hacking when practicing it is problematic (and for good reason)? Learning is fine if you work in a company and have access to dedicated machines for penetration testing and honing skills, but that is not necessarily the case if you are new to the field and want to skill up.

My experience in data science has taught me that access to data and computational resources is becoming increasingly available and reducing barriers to success. Why not make hacking similarly available to all? This matters all the more because cybersecurity is so important in today’s world and there is a critical lack of cybersecurity professionals.

TryHackMe is basically the Google Colab equivalent for hacking.

In this article, I’ve summarized what I’ve learnt from TryHackMe over the past week in the broader context of hacking and computer networks in general. I hope this inspires your journey in the field of cybersecurity, or helps in recapping some of the basic computer and networks fundamentals in an interesting and relevant context.

The larger goal of this series is to connect what hackers do on the technical side (as I talk about in this article), to the cascading societal consequences from cyber attacks. I’ve shown in my recent research how to quantify the consequences of hacked vehicles on transportation infrastructures, and cascading delays arising from targeted attacks on air infrastructures. The lessons learned can hopefully help realize cyber-resilient societies.

Follow me if you liked this article. I write about my experiences as a researcher and entrepreneur at the interface of data science, cybersecurity, and society.