1996 IEEE Symposium on Security and Privacy

A SENSE OF SELF FOR UNIX PROCESSES

Stephanie Forrest, Steven A. Hofmeyr, Anil Somayaji, University of New Mexico
Thomas A. Longstaff, Carnegie-Mellon University

OUTLINE
- Introduction
- Related Work
- Defining Self
- Experiments
- Building a normal database
- Distinguishing Between Processes
- Anomalous Behavior
- Discussion
- Conclusions

INTRODUCTION
- The authors developed a security method based on the way natural immune systems distinguish self from other.
- An important prerequisite of such a system is an appropriate definition of self.
- Problem: what the authors mean by self in a computer system seems at first to be more dynamic than in the case of natural immune systems.
  - Users update software, edit files, add new programs, and change personal work habits.
  - The normal behavior of the system changes, sometimes dramatically, and a successful definition of self will need to accommodate these legitimate activities.

INTRODUCTION (CONTD)
- The goal is to identify self in such a way that the definition is sensitive to dangerous foreign activities.
- Too narrow a definition will result in many false positives, while too broad a definition of self will be tolerant of some unacceptable activities (false negatives).
- This paper aimed at establishing such a definition of self for Unix processes.
  - Short sequences of system calls in running processes generate a stable signature for normal behavior.
  - The signature is specific to each different kind of process, providing clear separation between different kinds of programs.
  - The signature has a high probability of being perturbed when abnormal activities, such as attacks or attack attempts, occur.

RELATED WORK
- There are two basic approaches to intrusion detection: misuse intrusion detection and anomaly intrusion detection.
- Fink and Levitt focused on determining normal behavior for privileged processes, those that run as root.
  - They define normal behavior using a program specification language, in which the allowed operations (system calls and their parameters) of a process are formally specified.
- The authors use a much simpler representation of normal behavior.
  - They rely on examples of normal runs (rather than a formal specification of a program's expected behavior), and they ignore parameter values.

Reference: G. Fink and K. Levitt. Property-based testing of privileged programs. In Proceedings of the 10th Annual Computer Security Applications Conference (ACSAC), pages 154-163, December 5-9, 1994.
DEFINING SELF
- Program code stored on disk is unlikely to cause damage until it runs.
  - System damage is caused by running programs that execute system calls.
- They consider only privileged processes.
  - Root processes are more dangerous than user processes because they have access to more parts of the computer system.
  - They have a limited range of behavior, and their behavior is relatively stable over time.
- Every program implicitly specifies a set of system call sequences that it can produce.
  - During normal execution, some subset of these sequences will be produced.
DEFINING SELF (CONTD)
- It is likely that any given execution of a program will produce a complete sequence of calls that has not been observed.
- The local (short range) ordering of system calls appears to be remarkably consistent, and this suggests a simple definition of self, or normal behavior.
- The authors define normal behavior in terms of short sequences of system calls in a running process, currently sequences of lengths 5, 6, and 11.
- The sequences of system calls form the set of normal patterns for the database, and abnormal sequences indicate anomalies in the running process.
DEFINING SELF (CONTD)
- This definition of normal behavior ignores many aspects of process behavior, such as the parameter values passed to system calls, timing information, and instruction sequences between system calls.
  - MIKE: ignoring such aspects may also affect the accuracy of static analysis.
- Certain intrusions might only be detectable by examining other aspects of a process's behavior, so we might need to consider them later.
- The authors' philosophy is to see how far they can go with the simple assumption.

AN EXAMPLE OF ATTACK
- A buffer overflow attack can overwrite the executable instructions in memory that were loaded by copying the binary code stored on disk.
- After the overflow, the CPU executes the binary code carried in the overflowing data.
- As a result, the process produces an unexpected sequence of system calls that belongs to the injected code rather than to the original code.
- A (compromised) process is harmful only if it executes certain code, especially system calls.
- However, there are still some attacks that can compromise a process without executing their own code.

DEFINING SELF DETAILS
- 1st stage
  - They scan traces of normal behavior and build up a database of characteristic normal patterns (observed sequences of system calls).
  - Forks are traced individually, and their traces are included as part of normal.
- 2nd stage
  - They scan new traces that might contain abnormal behavior, looking for patterns not present in the normal database.
- Analysis of traces is performed off-line.
- They slide a window of size k+1 across the trace of system calls and record which calls follow which within the sliding window.
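The window-sliding step can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation: the function name `build_normal_db`, the set-of-triples representation, and the sample trace are all assumptions made for the example.

```python
from typing import Iterable, Set, Tuple

# One database entry: (call, offset, follower) means that "follower" was seen
# "offset" positions after "call" inside some window of size k+1.
Pattern = Tuple[str, int, str]


def build_normal_db(trace: Iterable[str], k: int) -> Set[Pattern]:
    """Slide a window of size k+1 over the trace and record which calls follow which."""
    calls = list(trace)
    db: Set[Pattern] = set()
    for i, call in enumerate(calls):
        # Look ahead at most k positions; the window shrinks near the end of the trace.
        for offset in range(1, k + 1):
            if i + offset >= len(calls):
                break
            db.add((call, offset, calls[i + offset]))
    return db


# Hypothetical normal trace, chosen only for illustration.
normal_trace = ["open", "read", "mmap", "mmap", "open", "getrlimit", "mmap", "close"]
normal_db = build_normal_db(normal_trace, k=3)
```

Storing the patterns in a set makes the later membership check for a new trace a constant-time lookup per pattern, which is one simple way to keep the check cheap.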
DEFINING SELF DETAILS (CONTD)
- Let k = 3, and window = open, read, mmap, mmap, ...
- The database is complete after all the elements are parsed and the window has slid to the last element.

DEFINING SELF DETAILS (CONTD)
- Once we have the database of normal patterns, we check new traces against it using the same method.
- We slide a window of size k+1 across the new trace, determining whether the sequence of system calls differs from that recorded in the normal database.
- Test case: open, read, mmap, open, open, getrlimit, mmap, close
  - The test case length (L) is 8, k is 3 (window size k+1 = 4), and there are 4 mismatches in this case.
  - The maximum number of mismatches for a sequence of this length is 18.
  - Miss rate = number of mismatches / maximum number of mismatches for a sequence of length L = 4/18 = 22%.

EXPERIMENTS
- They introduced a definition for normal behavior, based on short sequences of system calls.
- What size database do we need to capture normal behavior?
- What percentage of possible system call sequences is covered by the database of normal system call sequences?
- Does our definition of normal behavior distinguish between different kinds of programs?
- Does our definition of normal detect anomalous behavior?
- They focus on sendmail, although they also report some data for lpr.

BUILDING A NORMAL DATABASE
- Although the idea of collecting traces of normal behavior sounds simple, a number of decisions must be made regarding how much and what kind of normal behavior is appropriate.
  - Generate an artificial set of test messages?
  - Or monitor real user mail and hope that it covers the full spectrum of normal?
- The authors elected to use a suite of 112 artificially constructed messages, which included as many normal variations as possible.
  - These 112 messages produced a combined trace length of over 1.5 million system calls.

BUILDING A NORMAL DATABASE (CONTD)
- Test: count the number of mismatches between a new trace for testing and the normal database.
  - Ideally, they would like these numbers to be zero for new examples of normal behavior, and to jump significantly when abnormalities occur.
  - In a real system, a threshold value would need to be determined, below which a behavior is said to be normal, and above which it is deemed anomalous.
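The mismatch count, the miss rate, and a threshold test can be sketched together, using the test case given earlier (open, read, mmap, open, open, getrlimit, mmap, close). The sketch is illustrative only: the helper names, the normal trace, and the threshold value 0.1 are assumptions, not values from the study. The maximum number of mismatches for a trace of length L with lookahead k is taken to be k(L - (k+1)/2), which gives 18 for L = 8 and k = 3.

```python
from typing import List, Set, Tuple

Pattern = Tuple[str, int, str]  # (call, offset, follower)


def lookahead_pairs(trace: List[str], k: int) -> List[Pattern]:
    """All (call, offset, follower) triples seen with lookahead k, duplicates kept."""
    return [
        (trace[i], off, trace[i + off])
        for i in range(len(trace))
        for off in range(1, k + 1)
        if i + off < len(trace)
    ]


def count_mismatches(trace: List[str], normal_db: Set[Pattern], k: int) -> int:
    """Number of triples in the trace that are absent from the normal database."""
    return sum(1 for p in lookahead_pairs(trace, k) if p not in normal_db)


def max_mismatches(length: int, k: int) -> int:
    """k * (L - (k + 1) / 2) in integer arithmetic; assumes length > k."""
    return k * (2 * length - k - 1) // 2


def miss_rate(trace: List[str], normal_db: Set[Pattern], k: int) -> float:
    return count_mismatches(trace, normal_db, k) / max_mismatches(len(trace), k)


def is_anomalous(trace: List[str], normal_db: Set[Pattern], k: int,
                 threshold: float = 0.1) -> bool:
    # The threshold is a hypothetical value; the study only reports raw numbers.
    return miss_rate(trace, normal_db, k) > threshold


# Assumed normal trace (illustration only) and the test case from the slides.
normal_trace = ["open", "read", "mmap", "mmap", "open", "getrlimit", "mmap", "close"]
test_trace = ["open", "read", "mmap", "open", "open", "getrlimit", "mmap", "close"]
normal_db = set(lookahead_pairs(normal_trace, 3))

print(count_mismatches(test_trace, normal_db, 3))     # 4
print(max_mismatches(len(test_trace), 3))             # 18
print(round(miss_rate(test_trace, normal_db, 3), 2))  # 0.22
```

With the assumed normal trace, the four mismatches come from the extra `open` at the fourth position; choosing a real threshold would require data on how much normal traces themselves vary.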
In this study, they simply report the numbers.
MIKE: that is because the coverage of the normal databases is not enough.

BUILDING A NORMAL DATABASE (CONTD)
- Q: the size of the normal database
  - If the database is small, then it defines a compact signature for the running process that would be practical to check in real time while the process is active.
  - Conversely, if the database is large, then the approach will be too expensive to use for on-line monitoring.
- The size of the normal database gives an estimate of how much variability there is in the normal behavior of sendmail.
  - This consideration is crucial because too much variability in normal would preclude detecting anomalies.
  - In the worst case, if all possible sequences of length 6 show up as legal normal behavior, then no anomalies could ever be detected.

BUILDING A NORMAL DATABASE (CONTD)
- Q: how much normal behavior should be sampled to provide good coverage of the set of allowable sequences?
  1. Enumerate potential sources of variation for normal sendmail operation.
  2. Generate example mail messages that cause sendmail to exhibit these variations.
  3. Build a normal database from the sequences produced by step 2.
  4. Continue generating normal mail messages, recording all mismatches and adding them to the normal database as they occur.
- For the number of messages in a sendmail run, we first sent 1 message and traced sendmail, then sent 5 messages while tracing sendmail, and so forth, up to 20 messages.
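The "keep adding mismatches as they occur" step can be sketched as a loop that tracks how many new patterns each additional trace contributes. This is an illustration under stated assumptions: the function names and the toy traces are hypothetical, and patterns are stored as (call, offset, follower) triples.

```python
from typing import Iterable, List, Set, Tuple

Pattern = Tuple[str, int, str]  # (call, offset, follower)


def patterns(trace: List[str], k: int) -> Set[Pattern]:
    """Distinct (call, offset, follower) triples in one trace with lookahead k."""
    return {
        (trace[i], off, trace[i + off])
        for i in range(len(trace))
        for off in range(1, k + 1)
        if i + off < len(trace)
    }


def grow_database(traces: Iterable[List[str]], k: int) -> Tuple[Set[Pattern], List[int]]:
    """Add each trace's unseen patterns to the database; report new patterns per trace."""
    db: Set[Pattern] = set()
    new_per_trace: List[int] = []
    for trace in traces:
        unseen = patterns(trace, k) - db
        new_per_trace.append(len(unseen))
        db |= unseen
    return db, new_per_trace


# Toy traces standing in for traced sendmail runs (hypothetical data).
toy_traces = [
    ["open", "read", "close"],
    ["open", "read", "close"],
    ["open", "write", "close"],
]
db, new_counts = grow_database(toy_traces, k=2)
print(new_counts)  # [3, 0, 2]
print(len(db))     # 5
```

A long run of zeros at the tail of `new_counts` would suggest that the database is saturating for the kinds of behavior sampled so far, though it says nothing about behavior that was never exercised.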
BUILDING A NORMAL DATABASE (CONTD)
- Q: What percentage of the total possible patterns (for sequences of length 6) is covered by the normal database?
  - For example, if the database is completely full (all possible patterns have been recorded as normal) by 3000 system calls, then it would hardly be surprising that no new patterns are seen over time.
- We estimate that the sendmail database described above covers about 5×10^-5 % of the total possible patterns of system calls, an extremely small fraction.
- Such a calculation is somewhat misleading, because it is unlikely that the sendmail program is capable of generating many of these sequences.
  - Determining which sequences sendmail can actually generate would require a detailed analysis of the sendmail source code, an area of future investigation.

DISTINGUISHING BETWEEN PROCESSES
- To determine how the behavior of sendmail compares with that of other processes, we tested several common processes against the normal sendmail database with 1500 entries.
  - Sequences generated by the other processes are checked against the normal sendmail database.
- Table columns:
  - %: the percentage of abnormal sequences
  - #: the number of abnormal sequences
- These processes have a significant number of abnormal sequences, approximately 5-32% for sequences of length 6, because the actions they perform are considerably different from those of sendmail.
ANOMALOUS BEHAVIOR
- Generated three types of behavior:
  - traces of successful sendmail intrusions: sunsendmailcp, syslog, decode, lprcp
  - traces of sendmail intrusion attempts that failed: sm565a, sm5x
  - traces of error conditions: forward loop
- The % column indicates the percentage of abnormal sequences in one typical intrusion.
- The # column indicates the number of abnormal sequences.

DISCUSSION
- These preliminary experiments suggest that short sequences of system calls define a stable signature that can detect some common sources of anomalous behavior.
  - Easy to compute, relatively modest in storage requirements, and usable in an on-line system
- Our approach is predicated on two important properties:
  - The sequence of system calls executed by a program is locally consistent during normal operation.
  - Some unusual short sequences of system calls will be executed when a security hole in a program is exploited.
- Intrusions the method is likely to notice:
  - A program enters an unusual error state during an attempted break-in, and this error condition executes a sequence of system calls that is not already covered by our normal database.
  - Code is replaced inside a running program by an intruder.
- In addition, a successful intruder will likely need to fork a new process.

DISCUSSION (CONTD)
- However, if an intrusion does not fit into either of these two categories, our method will almost certainly miss it under the current definition of normal. For example:
  1. race condition attacks
  2. an intruder using another user's account
- Although the method we describe here will not provide a cryptographically strong or completely reliable discriminator between normal and abnormal behavior, it could potentially provide a lightweight, real-time tool for continuously checking executing code based on frequency of execution.
- To achieve reliable discrimination, we need to ensure that our method of flagging sequences as abnormal does not produce too many false negatives or false positives.

DISCUSSION (CONTD)
- We currently monitor only the presence or absence of patterns, not their relative frequency.
- However, there are many other matching criteria that could be tried.
  - E.g., we could represent the possible sequences of legal system calls as a Markov chain and define a criterion based on expected frequencies.
- The behavioral notion of identity goes well beyond a simple checksum, login, password, or IP address, because it considers dynamic patterns of activity rather than just static components.
- We are not yet using any partial or approximate matching, such as that used by the immune system, and we are not using on-line learning, as in the case of affinity maturation or negative selection.

CONCLUSIONS
- The approach is simple and practical.
- It defines self for privileged Unix processes in terms of normal patterns of short sequences of system calls.
- There are attacks that the method is not able to detect.
- Comments: certain issues are still unsolved.
  - What is self?
  - How can normal traces be obtained using a dynamic model?
  - Why can the model distinguish normal from abnormal?
  - The coverage problem: how do we know the collection of normal traces is good enough?
  - There is no criterion for deciding the length of the system call sequences.