Historically, antimalware tools have relied on information about known malware to detect and prevent attacks accurately. Security specialists develop new signatures and distribute them to customers regularly. Once the antimalware tools have been updated with the new signatures, the known malicious code those signatures describe is thwarted. For quite a while, this method proved to be very effective. The current trend among malware writers, however, is the prolific production of new variants of existing malware that are potentially invisible to existing signatures. Over 1 million new malware variants have been observed per day. It is noteworthy that over 90% of existing cyber-attacks utilize malware, which severely challenges the effectiveness of signature-based antimalware tools. Moreover, signature development requires specific knowledge of a malware application. This means that signatures are developed, distributed and deployed only after brand-new malware and variants have emerged. The result is a window of opportunity for new, never-before-seen (also known as zero-day) malware to successfully exploit inadequately protected systems.
The signature model is variously referred to as pattern matching or byte matching, an apt designation given the methodology applied. As a model that has been in use for decades, its strengths and weaknesses are well known to security practitioners and adversaries alike. Consequently, contemporary security vendors have enhanced signature technology with other approaches, the most common of which are explained below.
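As a rough illustration of byte matching, the minimal Python sketch below scans a buffer for known byte patterns. The signature names and patterns are invented for this example and correspond to no real malware; commercial engines use far more elaborate formats and matching strategies.

```python
from typing import Dict, List

# Hypothetical signature database: signature name -> byte pattern.
# Both entries are made up for illustration only.
SIGNATURES: Dict[str, bytes] = {
    "Example.Dropper.A": bytes.fromhex("deadbeef4d5a9000"),
    "Example.Worm.B": b"EXAMPLE-WORM-MARKER",
}


def scan_bytes(data: bytes, signatures: Dict[str, bytes] = SIGNATURES) -> List[str]:
    """Return the names of all signatures whose byte pattern occurs in data."""
    return sorted(name for name, pattern in signatures.items() if pattern in data)


# A file containing the known pattern, and a variant with one byte changed.
original = b"\x00\x01" + bytes.fromhex("deadbeef4d5a9000") + b"\x02"
variant = b"\x00\x01" + bytes.fromhex("deadbeef4d5a9001") + b"\x02"

print(scan_bytes(original))  # ['Example.Dropper.A']
print(scan_bytes(variant))   # []
```

The one-byte variant slips past the scanner, which is the weakness that new malware variants exploit.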
Within the context of an antivirus/antimalware (AV) tool, heuristic rules are employed to match artifacts and properties that are typically inconsistent with an otherwise benign application.
Generally, the matching process is weighted so that when a threshold is exceeded, preemptive action may be taken. Outside the context of AV, heuristic analysis tools may also be used to identify potentially malicious code.
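The weighted matching described above might be sketched as follows. The rule names, weights and threshold are assumptions chosen purely for illustration; real products tune such values against large sample sets.

```python
from typing import Dict, Set

# Hypothetical heuristic rules: observed property -> weight.
HEURISTIC_WEIGHTS: Dict[str, int] = {
    "packed_executable": 3,
    "writes_to_startup_folder": 4,
    "disables_security_service": 5,
    "no_version_info": 1,
}

# Assumed cut-off: a total score above this value triggers action.
THRESHOLD = 6


def heuristic_score(properties: Set[str]) -> int:
    """Sum the weights of every rule that matched; unknown properties score 0."""
    return sum(HEURISTIC_WEIGHTS.get(prop, 0) for prop in properties)


def is_suspicious(properties: Set[str], threshold: int = THRESHOLD) -> bool:
    """Flag the sample only when its total score exceeds the threshold."""
    return heuristic_score(properties) > threshold


print(is_suspicious({"packed_executable", "no_version_info"}))           # False (score 4)
print(is_suspicious({"packed_executable", "writes_to_startup_folder"}))  # True (score 7)
```

No single property condemns a file in this sketch; it is the accumulation of unusual traits that pushes the score over the threshold.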
Behavioral analysis targets the behavior exhibited by malware. Several behaviors point to potential danger. Where behavior is examined and assessed as the code executes, the process is known as dynamic analysis. Conversely, static analysis is performed pre-execution and examines the code for signs of malicious intent. Typical areas of examination include actions performed by the code on the file system, registry, processes and the network.
Hashing is a technique that takes a file of any size and type as input and produces a fixed-length output known as a hash. With a cryptographic hash function, there is no practical way to recover the original file from its corresponding hash. Popular algorithms used for producing hashes include MD5 and SHA256; CRC32 is also used, although it is a checksum rather than a cryptographic hash. Any change to the original file, no matter how trivial, will result in a completely different hash when processed by a cryptographic hash function. Many AV applications can apply a hash function to different parts of a file and then compare the resulting hashes with those of known malware. A match allows the AV to identify a known piece of malware with high confidence, although the absence of a match does not prove that the code is benign.
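A minimal sketch of hash-based lookup, using SHA256 from Python's standard `hashlib` module, is shown below. The sample bytes and the `KNOWN_BAD_HASHES` set are stand-ins invented for this example; a real AV product would consult a vendor-maintained database.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Return the SHA256 digest of data as a 64-character hex string."""
    return hashlib.sha256(data).hexdigest()


# Stand-in for the contents of a known malicious file (illustrative only).
sample = b"pretend this is the content of a known malicious file"
# The same content with a single byte appended.
variant = sample + b"!"

# Hypothetical blocklist of hashes of known malware.
KNOWN_BAD_HASHES = {sha256_hex(sample)}


def is_known_malware(data: bytes) -> bool:
    """True only when the exact content has been seen and listed before."""
    return sha256_hex(data) in KNOWN_BAD_HASHES


print(len(sha256_hex(sample)))    # 64, regardless of input size
print(is_known_malware(sample))   # True
print(is_known_malware(variant))  # False
```

The appended byte yields an entirely different digest, so the variant is not recognized even though it is almost identical to the listed sample.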
Defence in Depth (DiD) has a military origin and was developed to defend physical assets by employing layers of defence that compel the attacker to expend substantial resources. According to the SANS Institute, the tactical goal is to delay the enemy attack and render it unsustainable, thus giving the defender the opportunity to counterattack and eliminate the threat. How effectively this approach can be adapted and applied to the digital world remains a hotly debated topic.
Before proceeding further, it is useful to point out that there is a distinction between a layered defence approach and DiD. Today, multiple information security tools exist with fairly specialized defence mechanisms, and while such tools may be effective, they are not comprehensive. Hence, an attacker could potentially compromise systems by exploiting the vulnerabilities left exposed. The layered approach prescribes a simultaneous, cohesive application of different defences that together should close all security gaps.
DiD is comprehensive, broader in scope and includes layered defence elements. As stated by the US Department of Homeland Security, DiD employs a holistic approach to protect all assets, fully considering interconnections and dependencies and utilizing organizational resources based on cyber security risk exposure.
There are multiple approaches to the DiD strategy, and the nomenclature used is diverse. Notwithstanding this plurality, the fundamental tenet remains the same: the use of multiple defence measures at different segments to proactively protect and secure digital assets.
Machine Learning (ML) has been a reality since the 1950s. Arthur Samuel coined the label in 1959 and defined ML as a field of study that gives computers the ability to learn without being explicitly programmed. Computers possess the ability to perform calculations and processing tasks with levels of precision, performance and persistence that far exceed average human capability. Conventionally, this is achieved through the instructions that make up a program. ML, on the other hand, requires that large amounts of data corresponding to past events be fed into the computer, and learning takes place from that data. There are many mathematical models available for achieving this outcome, but here is a simple example.
We want to teach a computer how to detect spam emails. In this scenario, we have looked at five hundred emails over a period of time and determined that one hundred of them are spam. On analysing the emails, we observed that eighty of the emails flagged as spam contain the word “free” and twenty of the emails not flagged as spam contain the word “free”. So, we focus on the one hundred emails that contain the word “free” and set aside the other messages. Since eighty of these one hundred emails are spam, the probability is 80% that an email containing the word “free” is spam and 20% that it is not. This becomes the basis of a rule used for predicting whether or not an email is spam. If it has also been determined that the probabilities of a spam email containing a spelling mistake and missing a subject are 70% and 95% respectively, the system can combine all these criteria to learn how to recognize spam. This is an example of the Naïve Bayes algorithm.
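The arithmetic in this example can be reproduced with Bayes' rule, as in the minimal Python sketch below. The counts come from the scenario above. For the combined case, the 70% and 95% figures are read as the likelihood of each feature given that the email is spam, and the corresponding rates for legitimate emails (10% and 5%) are assumptions added only so that the calculation can run.

```python
from typing import Sequence

# Counts taken from the scenario in the text.
total_emails = 500
spam_emails = 100
ham_emails = total_emails - spam_emails  # 400 legitimate emails
spam_with_free = 80
ham_with_free = 20

p_spam = spam_emails / total_emails               # prior P(spam) = 0.2
p_free_given_spam = spam_with_free / spam_emails  # P("free" | spam) = 0.8
p_free_given_ham = ham_with_free / ham_emails     # P("free" | not spam) = 0.05


def posterior_spam(prior_spam: float,
                   likelihoods_spam: Sequence[float],
                   likelihoods_ham: Sequence[float]) -> float:
    """Naive Bayes posterior P(spam | features), treating features as independent."""
    score_spam = prior_spam
    score_ham = 1.0 - prior_spam
    for l_spam, l_ham in zip(likelihoods_spam, likelihoods_ham):
        score_spam *= l_spam
        score_ham *= l_ham
    return score_spam / (score_spam + score_ham)


# Single feature: the email contains the word "free".
p_spam_given_free = posterior_spam(p_spam, [p_free_given_spam], [p_free_given_ham])
print(round(p_spam_given_free, 2))  # 0.8, matching the 80% in the text

# Three features combined. The spam likelihoods 0.70 and 0.95 follow the text;
# the legitimate-email likelihoods 0.10 and 0.05 are ASSUMED for illustration.
p_combined = posterior_spam(
    p_spam,
    [p_free_given_spam, 0.70, 0.95],
    [p_free_given_ham, 0.10, 0.05],
)
print(p_combined > p_spam_given_free)  # True: more evidence, higher confidence
```

The "naïve" part of the name refers to the independence assumption in `posterior_spam`: each feature's likelihood is simply multiplied in, as though the features never influence one another.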
Another learning technique used for classification is the Decision Tree. While the focus here is on classification, bear in mind that decision trees may also be used to solve regression problems. Structurally, a decision tree consists of nodes that are linked in a specific way. Each node represents an attribute or feature and each link represents a decision or rule. The nodes that are connected by a single link are known as terminal nodes and represent the outcome.
The following diagram, referenced from the book “Introduction to Artificial Intelligence for Security Professionals”, illustrates the methodology behind a decision tree. In this scenario, the decision tree is used to identify malicious uniform resource locators (URLs), which are denoted by outcomes greater than zero. The tree was constructed with training samples consisting of a mix of malevolent and benevolent URLs. First, all the URLs were divided based on the value of the age attribute, and this exercise produced two distinct branches of the tree.
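Separately from the book's diagram, the structure of a decision tree can be sketched in a few lines of Python. The tree below is written by hand rather than learned from training data, and the `dots` feature, the thresholds and the outcomes are all hypothetical; only the idea of splitting first on an age attribute and treating outcomes greater than zero as malicious follows the description above.

```python
from dataclasses import dataclass
from typing import Dict, Union


@dataclass
class Leaf:
    """Terminal node: outcome > 0 means malicious, 0 means benign."""
    outcome: int


@dataclass
class Node:
    """Internal node: tests one feature against a threshold."""
    feature: str
    threshold: float
    below: Union["Node", Leaf]        # followed when value < threshold
    at_or_above: Union["Node", Leaf]  # followed otherwise


def classify(tree: Union[Node, Leaf], sample: Dict[str, float]) -> int:
    """Walk from the root to a leaf, following one link per decision."""
    node = tree
    while isinstance(node, Node):
        if sample[node.feature] < node.threshold:
            node = node.below
        else:
            node = node.at_or_above
    return node.outcome


# Hypothetical tree: split on age (in days) first, then on the number of
# dots in the URL. Thresholds are invented for this example.
TREE = Node(
    feature="age",
    threshold=30.0,
    below=Node(feature="dots", threshold=4.0, below=Leaf(0), at_or_above=Leaf(1)),
    at_or_above=Leaf(0),
)

print(classify(TREE, {"age": 5, "dots": 6}))    # 1 (malicious)
print(classify(TREE, {"age": 400, "dots": 6}))  # 0 (benign)
```

In practice a learning algorithm chooses the features and thresholds from the training samples; the sketch shows only how a finished tree turns attribute values into an outcome.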
TRUSTWORTHY Systems Inc. has been helping business organizations acquire, configure and maintain the most effective endpoint security solutions available since 2004. Our team of specialists has worked with organizations in the financial, health, retail, hospitality, legal, education and manufacturing sectors helping them secure their endpoints and data from malicious attack. Our team is trained, certified and authorized to deliver endpoint security solutions from Cylance and McAfee.
Cylance is a pioneer in using ML to protect client computers, mobile devices and servers from file-based malware. The Cylance solution utilizes a math engine that divides a single file into an astronomical number of characteristics and analyzes each one against hundreds of millions of other files to reach a decision about the normalcy of each characteristic. This is how the Cylance engine accurately identifies malware, whether packed or not, known or unknown. The model eliminates the traditional application of signatures and frequent updates.
McAfee is a long-standing security vendor for the protection of digital assets and remains among the top three endpoint protection providers according to a Gartner 2018 report. The McAfee Endpoint 10 solution delivers threat protection, firewall and web control modules to protect the data on several types of digital platforms. The structure includes a foundation of common components that service the modules. This structure provides cohesive communication and operation among the modules, which significantly enhances performance and security. Endpoints are managed through the award-winning ePolicy Orchestrator (ePO), providing security professionals with visibility and control across the IT landscape.
Let TRUSTWORTHY Systems Inc. be the catalyst for changing risky behavior and developing a better digital experience within your organization.