Date: Sun, 11 Jul 2021 23:17:32 +0300
From: Mehmet Erol Sanliturk <m.e.sanliturk@gmail.com>
To: Vlad Markov <dvoich@optonline.net>
Cc: FreeBSD Questions Mailing List <freebsd-questions@freebsd.org>
Subject: Re: Analyzing Log files of very large size
Message-ID: <CAOgwaMtBJG76McuLkY5M0xfGvz6_hgSpY%2BuVcsa7t6rpmGZ0cw@mail.gmail.com>
In-Reply-To: <20210711103839.61dfd4baafa38984f208b707@optonline.net>
References: <CAKgGyB_TJrLWSjcnc9491Gg0Q5CLqLdmWx2yga_Ez7-gE6YcKQ@mail.gmail.com>
 <E9C00664-DAC7-4F58-BCCA-CDD2654C9325@febras.net>
 <CAKgGyB_reF4eqz4pvQj7tFsOQEEB3WrFZa-91L%2BNChm=85h0-A@mail.gmail.com>
 <20210711103839.61dfd4baafa38984f208b707@optonline.net>
On Sun, Jul 11, 2021 at 5:38 PM Vlad Markov <dvoich@optonline.net> wrote:

> On Sun, 11 Jul 2021 19:43:41 +0530
> KK CHN <kkchn.in@gmail.com> wrote:
>
> > Yes, it is.
> >
> > On Sun, Jul 11, 2021 at 6:02 PM Korolev Sergey <serejk@febras.net> wrote:
> >
> > > Is it a plain text file?
> > >
> > > On 11 Jul 2021, at 22:13, KK CHN <kkchn.in@gmail.com> wrote:
> > >
> > > List,
> > >
> > > I am in a requirement to analyze large log files of sonic wall firewall
> > > around 50 GB. for a suspect attack.
> > >
> > > What tools and solutions need to be deployed for handling this much large
> > > files and pls enlighten me with your expertise and reference materials if
> > > any.
> > >
> > > All are tcp / ip communications, DNS UDP transports ..
> > >
> > > Regards,
> > > Kris
> I used to use split to break up large log files into manageable pieces.
> From there it depends on how you work. At first we used grep then we moved
> on to using perl regex to analyze logs.
>
> Vlad
>
> --
>

My idea is as follows. I am trying to use such a feature in a database
management system to track the behavior of the program. The log generated
over a very short time came out at 56 gigabytes. During a backup of the
sources, the computer warned me: "You are trying to backup 56 GigaBytes
into a 4.7 GigaBytes DVD."

Assume a message line is 56 bytes; a file of this size then contains about
1 billion records to study. It is then practical to load a file of this
size into memory as an AVL tree by grouping the accessed parts and counting
their occurrences, so that the tree holds one node per distinct accessed
part rather than one node per log line.

In your case, you may generate your log as, perhaps,
"accessor, accessed parts, ...". Assume that you need to know who is
accessing (or attempting to access) some list of parts. During AVL tree
generation, use the "accessed parts" as keys, and attach the "accessor"
values to each key as its leaves, together with any other vital
information.

From an AVL tree it is very easy to get a list of such accessors in key
order and study them in more detail.
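
To make the idea above concrete, here is a minimal sketch in Python (chosen
only for brevity; a C version would follow the same shape). It assumes a
comma-separated log whose first field is the accessor and whose second field
is the accessed part, as in the "accessor, accessed parts, ..." layout
suggested above. The names Node, insert, build and in_order are my own
illustrations, not taken from any existing library, and a real SonicWall log
would need its own field parsing.

```python
class Node:
    """One AVL node per distinct accessed part (the key)."""
    __slots__ = ("key", "count", "accessors", "left", "right", "height")

    def __init__(self, key, accessor):
        self.key = key
        self.count = 1                  # total hits on this key
        self.accessors = {accessor: 1}  # hits per accessor
        self.left = None
        self.right = None
        self.height = 1


def height(n):
    return n.height if n else 0


def update(n):
    n.height = 1 + max(height(n.left), height(n.right))


def rotate_right(y):
    x = y.left
    y.left, x.right = x.right, y
    update(y)
    update(x)
    return x


def rotate_left(x):
    y = x.right
    x.right, y.left = y.left, x
    update(x)
    update(y)
    return y


def rebalance(n):
    update(n)
    balance = height(n.left) - height(n.right)
    if balance > 1:                     # left-heavy
        if height(n.left.left) < height(n.left.right):
            n.left = rotate_left(n.left)
        return rotate_right(n)
    if balance < -1:                    # right-heavy
        if height(n.right.right) < height(n.right.left):
            n.right = rotate_right(n.right)
        return rotate_left(n)
    return n


def insert(n, key, accessor):
    """Insert one (key, accessor) observation; repeated keys only bump counters."""
    if n is None:
        return Node(key, accessor)
    if key < n.key:
        n.left = insert(n.left, key, accessor)
    elif key > n.key:
        n.right = insert(n.right, key, accessor)
    else:
        n.count += 1
        n.accessors[accessor] = n.accessors.get(accessor, 0) + 1
        return n
    return rebalance(n)


def in_order(n):
    """Yield nodes in ascending key order."""
    if n:
        yield from in_order(n.left)
        yield n
        yield from in_order(n.right)


def build(lines):
    """Build the tree from an iterable of 'accessor, accessed part, ...' lines."""
    root = None
    for line in lines:
        fields = [f.strip() for f in line.split(",")]
        if len(fields) < 2:             # skip lines that do not fit the assumed layout
            continue
        root = insert(root, fields[1], fields[0])
    return root


if __name__ == "__main__":
    sample = ["10.0.0.1, /admin", "10.0.0.2, /admin", "10.0.0.1, /login"]
    for node in in_order(build(sample)):
        print(node.key, node.count, node.accessors)
```

Because build() accepts any iterable of lines, an open file object can be
passed to it directly, so the log is read one line at a time and only the
grouped counters stay in memory.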

Since only a small amount of information per key is needed, a computer
with ordinary memory capacity should be sufficient. If your memory is not
sufficient, you may use an SSD as backing storage, even one with read/write
speeds of 500 megabytes per second. Be careful about the wear on such disks
under very heavy write traffic.

It is very easy to find open source AVL tree software with sufficiently
permissive licenses. I do not know exactly, but I believe that even the
FreeBSD sources contain such code.

Information about AVL trees can be found in data structures books; books
that use C or C++ may be especially useful for you.

AVL tree: https://en.wikipedia.org/wiki/AVL_tree

Please search Google for the following phrase:

open source repositories about avl software

Mehmet Erol Sanliturk