Off topic and Off place:


naides
August 25th, 2005, 15:59
This post has nothing to do with RCE,
but I have a problem that is difficult for me and that other people here surely handle as second nature.

The problem:

I have ~2000 text files of data that I need to analyze and extract information from.

There are some web sites that offer online machines that do the analysis, then send you the results via web.

The problem is they only take one file at a time, so it becomes cumbersome and repetitive to do it 2000 times.

I am wondering if there is a way to automate the process, like a web browser with a script interpreter on the client side? I would need to read a file, paste it into the form, send it, read the answer, save it, and loop.
Is there such an animal?

Should I write a small perl program?
I have sniffed the format in which the data is transmitted and received, but I am rusty in programming. I would have to reconstruct the whole socket dialog, and I do not know if there are simpler solutions (I do not want to reinvent the wheel to drive two blocks).

Thank you

Woodmann
August 25th, 2005, 16:59
Howdy,

It depends on what you want to extract.
Words, phrases, images, etc.?

Woodmann

naides
August 25th, 2005, 18:49
Text. Plain text goes out and plain text comes back...

Woodmann
August 25th, 2005, 20:09
OK,

I am still a little confused by your request.
If you just want to work with text files, I will get you the links to get the software required.

You use the word "sniff" so I wonder if you want a tool to sniff exact words from packets.

Woodmann

Woodmann
August 25th, 2005, 20:17
I can't remember the name of the program I have used, but
search for "text extractors" and such.

Woodmann

Fake51
August 26th, 2005, 03:39
Would be sort of interesting to have a look at those machines of which you talk. If you don't mind, you could post a link, maybe someone else might be able to think of something useful. And you could cure some curiosity at the same time.

Fake

Aimless
August 26th, 2005, 04:32
You might want to try:

1. iOpus Internet Macros (kinda buggy though...)

2. WinBatch (my favourite...)

Let us know how it goes...

Have Phun

andrewg
August 26th, 2005, 17:38
If you're not afraid of getting your hands dirty, try a quick scripting language -- Python or Perl -- and use that to send / receive the data.

http://www.python.org/doc/2.4.1/lib/urllib2-examples.html
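
For readers following along, here is a minimal sketch of the read / send / save / loop idea in Python. It uses urllib.request, the Python 3 successor to the urllib2 module linked above. The URL, the form field name, and the helper names (build_request, submit, process_all) are all assumptions made up for illustration; the real values would have to come from the sniffed traffic, and the site may want extra fields or cookies that this does not handle.

```python
import time
import urllib.parse
import urllib.request
from pathlib import Path

# Hypothetical values: replace with what the sniffed traffic actually shows.
FORM_URL = "http://example.com/analyze"
FIELD_NAME = "data"


def build_request(text, url=FORM_URL, field=FIELD_NAME):
    """Encode the text as a single form field and wrap it in a POST request."""
    body = urllib.parse.urlencode({field: text}).encode("utf-8")
    return urllib.request.Request(url, data=body)  # data present => POST


def submit(text):
    """Send one file's contents and return the server's reply as text."""
    with urllib.request.urlopen(build_request(text), timeout=60) as resp:
        return resp.read().decode("utf-8", errors="replace")


def process_all(in_dir, out_dir, send=submit, delay=1.0):
    """Submit every .txt file in in_dir and save each reply in out_dir."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    done = []
    for path in sorted(Path(in_dir).glob("*.txt")):
        target = out_dir / (path.stem + ".result.txt")
        if target.exists():  # already processed, so an interrupted run can resume
            continue
        reply = send(path.read_text(encoding="utf-8", errors="replace"))
        target.write_text(reply, encoding="utf-8")
        done.append(target.name)
        time.sleep(delay)  # pause between requests to be polite to the server
    return done
```

Something like process_all("data", "results") would then walk the whole folder. The send parameter is there only so the loop can be tried with a stand-in function before pointing it at the real site.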

naides
August 26th, 2005, 19:19
Thank you all, guys, for your answers, both in the public forum and in private messages.
This forum is an incredible concentration of talent, and I am very happy and proud to be part of it.

no joke this time

thank you

0xf001
August 26th, 2005, 19:22
hi!

the thing is: do you want to do it yourself, or would you accept a ready-made script to study?

from my linux experience, this is exactly where you have 100 000 different possibilities: a little wget combined with sed, awk, or just perl, or better python - or ... whatever.
once you are used to writing little scripts (from scratch on the command line of course, hahaha) you will never stop scripting. honestly, perl is quite easy to learn and can do a lot for you in this respect - although i do not like it so much because it forces you to be lazy - ahem ... ok ..

i did not get what your plan is. what do you need the text files for? to parse something that needs to be pasted into the form? or do you maybe want to automate uploading the 2000 files, or ...?

if you are more specific or can just describe it again - i am sure i can help you!

[edit] - i read it again - is it that you don't _need_ a web server at all, and the task is just to extract information? then perl and regexps should do the job; it is just a question of finding the right regexp.
regular expressions are really the magic in text processing.
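
A small sketch of the regexp idea, written in Python rather than perl. The thread never shows what the replies look like, so the "Name: number" line layout assumed here is purely hypothetical, as are the names FIELD and extract_fields; the pattern would need to be rewritten for the real data.

```python
import re

# Hypothetical reply layout: lines such as "Score: 12.5" or "Length: 340".
FIELD = re.compile(
    r"^(?P<name>[A-Za-z ]+):[ \t]*(?P<value>-?\d+(?:\.\d+)?)[ \t]*$",
    re.MULTILINE,  # let ^ and $ match at every line, not just the whole text
)


def extract_fields(reply):
    """Return a dict of the numeric 'Name: value' lines found in reply."""
    return {
        m.group("name").strip(): float(m.group("value"))
        for m in FIELD.finditer(reply)
    }
```

Lines that do not fit the pattern are simply skipped, which is usually what you want when most of the reply is decoration around a few numbers.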

contact me, i need the challenge

cheers, 0xf001

Peres
August 27th, 2005, 06:30
Nobody here mentioned using a lexer/parser combination. Of course, you need the grammar for your files in this case.
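
To make the lexer/parser idea concrete, here is a toy sketch in Python. The grammar is invented for illustration (records of the form NAME = value ;), since the actual file format is not given in the thread; TOKEN, tokenize, and parse are likewise made-up names. A real job would swap in the grammar of the actual files, or use a parser generator.

```python
import re

# Hypothetical grammar, for illustration only:
#   file   := record*
#   record := NAME "=" value ";"
#   value  := NUMBER | NAME
TOKEN = re.compile(
    r"\s*(?:(?P<NUMBER>\d+(?:\.\d+)?)|(?P<NAME>[A-Za-z_]\w*)|(?P<OP>[=;]))"
)


def tokenize(text):
    """Lexer: turn the raw text into (kind, value) pairs."""
    pos, text = 0, text.rstrip()
    while pos < len(text):
        m = TOKEN.match(text, pos)
        if m is None:
            raise SyntaxError(f"unexpected character near offset {pos}")
        yield m.lastgroup, m.group(m.lastgroup)
        pos = m.end()


def parse(text):
    """Parser: check the tokens against the grammar and build a dict."""
    tokens = list(tokenize(text))
    result, i = {}, 0
    while i < len(tokens):
        chunk = tokens[i:i + 4]  # one record is exactly four tokens
        kinds = [kind for kind, _ in chunk]
        if (len(chunk) != 4 or kinds[0] != "NAME" or chunk[1] != ("OP", "=")
                or kinds[2] not in ("NUMBER", "NAME") or chunk[3] != ("OP", ";")):
            raise SyntaxError(f"bad record starting at token {i}")
        kind, value = chunk[2]
        result[chunk[0][1]] = float(value) if kind == "NUMBER" else value
        i += 4
    return result
```

Compared with a bare regexp, the split into lexer and parser means malformed input is reported as an error instead of being silently skipped.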

Peres