GJS : How can I use Regex ( parse HTML ) ?

Τάσος Λισγάρας
Hello,

I download a page and want to parse it to extract specific data.
Unfortunately, I couldn't find a ready-made library (in GJS) for HTML
parsing, so I am turning to regular expressions.

Because the documentation doesn't help me at all, can you please tell
me how to use regular expressions in GJS?

(My code does not run because I am not using the GJS regex library
properly.)

Thanks in advance for your time.
Anastasios.

Attachment: getAnnouncements (776 bytes)

Re: GJS : How can I use Regex ( parse HTML ) ?

Tony Houghton
GJS has standard JavaScript regexps built in, so you should be able to get all the info you need from the MDN documentation on regular expressions.
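
As a minimal sketch of what the built-in regexps look like in practice: the snippet below uses only standard JavaScript (a regex literal, test() and match()), so it should behave the same in GJS as in any other engine. The sample markup and all variable names are hypothetical, since the real page is not shown in this thread.

```javascript
// Hypothetical input; the real page markup is not shown in the thread.
const html = '<p>News</p>\n<a href="/a/1">First</a>\n<a href="/a/2">Second</a>';

// A regex literal; each pair of parentheses creates a capture group.
const linkRe = /<a href="([^"]+)">([^<]+)<\/a>/;

// RegExp.prototype.test() only reports whether there is a match.
const hasLink = linkRe.test(html);

// Without the g flag, String.prototype.match() returns the first match:
// index 0 is the whole match, 1..n are the capture groups (null if no match).
const first = html.match(linkRe);
const firstHref = first !== null ? first[1] : null;
const firstText = first !== null ? first[2] : null;

// With the g flag, match() returns every whole match but drops the groups.
const allAnchors = html.match(/<a href="[^"]+">[^<]+<\/a>/g);
```

Note that match() with the g flag loses the capture groups; when both "every match" and "the groups of each match" are needed, a loop over RegExp.prototype.exec() is the usual approach.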

Alternatively, nearly everything in GLib is available to GJS, so you should be able to use its simple XML parser.

On Sat, 22 Jun 2019 at 06:30, Τάσος Λισγάρας via javascript-list <[hidden email]> wrote:
> [...]
> _______________________________________________
> javascript-list mailing list
> [hidden email]
> https://mail.gnome.org/mailman/listinfo/javascript-list


--
TH

Re: GJS : How can I use Regex ( parse HTML ) ?

Emmanuele Bassi
In reply to this post by Τάσος Λισγάρας

You cannot parse HTML with regular expressions:

https://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags/1732454#1732454

Ciao,
 Emmanuele.

On Sat, 22 Jun 2019 at 06:30, Τάσος Λισγάρας via javascript-list <[hidden email]> wrote:
> [...]

--
https://www.bassi.io

Re: GJS : How can I use Regex ( parse HTML ) ?

Τάσος Λισγάρας
Unfortunately, because I don't have autocomplete, I have been struggling
with the correct use of the "match" and "matchAll" functions.
I repeatedly ran into the following error:
"Javascript JS ERROR: TypeError: mystr.matchAll is not a function"

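Eventually I used the built-in JavaScript RegExp with this bad code:

// Match the announcements table; "." does not match newlines, so this
// only works when the whole table sits on one line.
let tableOfAnnouncementsHTML = announcementsHTML.match(
    /<table class="table announcements-table">(.*)\.(.*)<\/table>/);
let announcements = [];

// Collect [<a> element, href, link text] for every link in the table.
if (tableOfAnnouncementsHTML !== null) {
    tableOfAnnouncementsHTML[0].replace(
        /[^<]*(<a href="([^"]+)">([^<]+)<\/a>)/g,
        function (...args) {
            announcements.push(args.slice(1, 4));
        });
}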
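
The "matchAll is not a function" error above most likely means that the JavaScript engine in use does not provide String.prototype.matchAll(); that is an assumption about the GJS version involved, not something confirmed in this thread. A minimal sketch of a fallback built on RegExp.prototype.exec() follows. The helper name matchAllCompat and the sample markup are hypothetical.

```javascript
// Hypothetical helper: collect every match, including its capture groups,
// on engines that lack String.prototype.matchAll().
function matchAllCompat(text, pattern) {
    // Copy the pattern and force the g flag so exec() advances through the text.
    const flags = pattern.flags.includes('g') ? pattern.flags : pattern.flags + 'g';
    const re = new RegExp(pattern.source, flags);
    const results = [];
    let m;
    while ((m = re.exec(text)) !== null) {
        // Avoid an infinite loop when the pattern matches the empty string.
        if (m[0] === '')
            re.lastIndex++;
        results.push(m);
    }
    return results;
}

// Hypothetical sample markup.
const sample = '<a href="/x">X</a> and <a href="/y">Y</a>';
const links = matchAllCompat(sample, /<a href="([^"]+)">([^<]+)<\/a>/)
    .map(m => ({ href: m[1], text: m[2] }));
```

Compared with calling replace() only for its side effects, the exec() loop makes the intent explicit and returns the matches as ordinary arrays that can be mapped or filtered.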


Is it possible to import modules from the npm registry in GJS and
GNOME Shell?
On the other hand, I don't want to use code that is merely open and not
free (I recently read an article about an issue with the npm registry).
But mostly I assumed it could not be done, and I did not want to add
complexity with (unnecessary) dependencies.
Moreover, isn't there a regex implementation in
"imports.gi.GLib.Regex"? Can't I use that?

Tony Houghton,
The site is written in plain HTML, so I guess the GLib XML parser will
not work. Right?
Also, I can't find the documentation for the XML parser in GJS, so I
haven't managed to work out how to use it in my code.

Emmanuele Bassi,
I know this "rule", but what else can I do?
Finally, as a last resort, I managed to get it working with my poor
regular-expression implementation above.

Thank you all!
Kind regards,
Anastasios Lisgaras

On 6/22/19 5:38 PM, Emmanuele Bassi wrote:

> You cannot parse HTML with regular expressions:
>
> https://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags/1732454#1732454
>
> Ciao,
>  Emmanuele.
>
> On Sat, 22 Jun 2019 at 06:30, Τάσος Λισγάρας via javascript-list
> <[hidden email]> wrote:
>
>     [...]
>
> --
> https://www.bassi.io
