Date:      Sun, 11 Jul 2021 23:17:32 +0300
From:      Mehmet Erol Sanliturk <m.e.sanliturk@gmail.com>
To:        Vlad Markov <dvoich@optonline.net>
Cc:        FreeBSD Questions Mailing List <freebsd-questions@freebsd.org>
Subject:   Re: Analyzing Log files of very large size
Message-ID:  <CAOgwaMtBJG76McuLkY5M0xfGvz6_hgSpY%2BuVcsa7t6rpmGZ0cw@mail.gmail.com>
In-Reply-To: <20210711103839.61dfd4baafa38984f208b707@optonline.net>
References:  <CAKgGyB_TJrLWSjcnc9491Gg0Q5CLqLdmWx2yga_Ez7-gE6YcKQ@mail.gmail.com> <E9C00664-DAC7-4F58-BCCA-CDD2654C9325@febras.net> <CAKgGyB_reF4eqz4pvQj7tFsOQEEB3WrFZa-91L%2BNChm=85h0-A@mail.gmail.com> <20210711103839.61dfd4baafa38984f208b707@optonline.net>

On Sun, Jul 11, 2021 at 5:38 PM Vlad Markov <dvoich@optonline.net> wrote:

> On Sun, 11 Jul 2021 19:43:41 +0530
> KK CHN <kkchn.in@gmail.com> wrote:
>
> > Yes, it is.
> >
> > On Sun, Jul 11, 2021 at 6:02 PM Korolev Sergey <serejk@febras.net>
> wrote:
> >
> > > Is it a plain text file?
> > >
> > > On 11 Jul 2021, at 22:13, KK CHN <kkchn.in@gmail.com> wrote:
> > >
> > > List,
> > >
> > > I am in a requirement to analyze large log files of sonic wall firewall
> > > around 50 GB. for a suspect attack.
> > >
> > > What tools and solutions need to be deployed for handling this much
> large
> > > files and pls enlighten me with your expertise and reference materials
> if
> > > any.
> > >
> > > All are tcp / ip communications, DNS UDP transports ..
> > >
> > > Regards,
> > > Kris
> I used to use split to break up large log files into manageable pieces.
> From there it depends on how you work. At first we used grep then we moved
> on to using perl regex to analyze logs.
>
> Vlad

My suggestion is as follows. It comes from my own work, where I am trying
to use a similar feature in a database management system to track the
behavior of the program.
The log generated over a very short time came to 56 gigabytes. While
backing up the sources, the computer warned me:
"You are trying to backup 56 GigaBytes into a 4.7 GigaBytes DVD."
Assuming a message line is 56 bytes, a file of this size contains 1 billion
records to study.
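
The record count above is a simple division. As a quick check, assuming
decimal gigabytes and the 56-byte average line length I guessed at (your
real lines will differ):

```python
# Rough record-count estimate; both figures are assumptions from the message.
LOG_SIZE_BYTES = 56 * 10**9   # 56 gigabytes, decimal
LINE_SIZE_BYTES = 56          # assumed average length of one log line

records = LOG_SIZE_BYTES // LINE_SIZE_BYTES
print(records)  # 1000000000
```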

Even so, it is easy to load a file of this size into memory as an AVL tree
by grouping the records by the accessed parts and counting their
occurrences, rather than keeping every line.

In your case, you might generate your log as, perhaps, "accessor, accessed
parts, ...".
Assume that you need to know who is accessing (or attempting to access)
some list of parts.
While building the AVL tree, use the "accessed parts" as keys, and store
the "accessor" values as their leaves, together with any other vital
information.
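
To make the key/value idea concrete, here is a minimal sketch of reading
such a log one line at a time, so the whole file never has to fit in
memory. The comma-separated "accessor, accessed part, ..." layout is only
the hypothetical format mentioned above, not the real SonicWall format,
and parse_records is a name I made up for illustration:

```python
import io
from typing import Iterator, TextIO, Tuple


def parse_records(stream: TextIO) -> Iterator[Tuple[str, str]]:
    """Yield (accessed_part, accessor) pairs from a comma-separated log.

    Assumes the hypothetical layout "accessor, accessed part, ...".
    Lines that do not have both fields are skipped.
    """
    for line in stream:  # streams line by line, no full-file load
        fields = [f.strip() for f in line.split(",")]
        if len(fields) < 2 or not fields[0] or not fields[1]:
            continue
        # key first (accessed part), then value (accessor)
        yield fields[1], fields[0]


# Made-up sample data; the addresses come from documentation ranges.
sample = io.StringIO(
    "192.0.2.7, dns, udp\n"
    "198.51.100.3, ssh, tcp\n"
    "garbage line without fields\n"
    "192.0.2.7, ssh, tcp\n"
)
for accessed, accessor in parse_records(sample):
    print(accessed, accessor)
```

For a real file you would pass open("your.log", errors="replace") instead
of the StringIO sample.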


From an AVL tree it is very easy to get a list of such accessors in order
and study them in more detail.
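
The whole idea can be sketched in a few dozen lines. This is only a
minimal illustration under my own assumptions, not production code: the
names Node, insert and in_order are made up, each key is an "accessed
part", and each node keeps a counter of accessors. Repeated records only
increase a count, so the tree grows with the number of distinct keys, not
with the number of log lines.

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import Iterator, Optional, Tuple


@dataclass
class Node:
    key: str                                             # accessed part
    accessors: Counter = field(default_factory=Counter)  # accessor -> count
    left: Optional["Node"] = None
    right: Optional["Node"] = None
    height: int = 1


def height(n: Optional[Node]) -> int:
    return n.height if n else 0


def update(n: Node) -> None:
    n.height = 1 + max(height(n.left), height(n.right))


def rotate_right(y: Node) -> Node:
    x = y.left
    y.left, x.right = x.right, y
    update(y)
    update(x)
    return x


def rotate_left(x: Node) -> Node:
    y = x.right
    x.right, y.left = y.left, x
    update(x)
    update(y)
    return y


def rebalance(n: Node) -> Node:
    """Restore the AVL property (subtree heights differ by at most 1)."""
    update(n)
    balance = height(n.left) - height(n.right)
    if balance > 1:                       # left-heavy
        if height(n.left.left) < height(n.left.right):
            n.left = rotate_left(n.left)  # left-right case
        return rotate_right(n)
    if balance < -1:                      # right-heavy
        if height(n.right.right) < height(n.right.left):
            n.right = rotate_right(n.right)  # right-left case
        return rotate_left(n)
    return n


def insert(n: Optional[Node], key: str, accessor: str) -> Node:
    """Count one access to `key` by `accessor`; return the new subtree root."""
    if n is None:
        node = Node(key)
        node.accessors[accessor] += 1
        return node
    if key < n.key:
        n.left = insert(n.left, key, accessor)
    elif key > n.key:
        n.right = insert(n.right, key, accessor)
    else:
        n.accessors[accessor] += 1        # existing key: just count it
        return n
    return rebalance(n)


def in_order(n: Optional[Node]) -> Iterator[Tuple[str, Counter]]:
    """Yield (key, accessor counts) in sorted key order."""
    if n:
        yield from in_order(n.left)
        yield n.key, n.accessors
        yield from in_order(n.right)


# Made-up records: (accessed part, accessor).
root = None
for accessed, accessor in [("ssh", "198.51.100.3"), ("dns", "192.0.2.7"),
                           ("ssh", "192.0.2.7"), ("ssh", "198.51.100.3")]:
    root = insert(root, accessed, accessor)

for key, accessors in in_order(root):
    print(key, accessors.most_common())
```

In Python a plain dict of Counters would do the same counting with less
code; the tree is shown only because it keeps the keys in sorted order and
maps directly onto the C or C++ implementations you are likely to find.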
Since a small amount of information per record is sufficient, a computer
with an ordinary memory capacity will be enough.
If your memory is not sufficient, you may use an SSD as storage, with
write/read speeds of even 500 megabytes per second.
Be careful about the wear on such disks under very high numbers of
write/read operations.



It is very easy to find open source AVL tree software with sufficiently
permissive licenses. I do not know exactly, but I believe that even the
FreeBSD sources contain such code.

Information about AVL trees can be found in data structures books;
books that use C or C++ may be especially useful for you.


https://en.wikipedia.org/wiki/AVL_tree
AVL tree

Please search for the following phrase in Google:

open source repositories about avl software



Mehmet Erol Sanliturk


