Formatting HTML string to plain string

Hello everyone,
I’m making a game which makes use of some HTTP APIs, but I’ve run into a problem with converting some of the returned data, namely some values which contain HTML elements in their string.

I’ll give an example as to what I mean:

local msg = '<p>This is a demonstration &amp; a <a href=\"https://www.google.com">hyperlink example</a>  </p>'

My ideal result of converting this would be "This is a demonstration & a hyperlink example." (without the link in question)

So, basically, anyone know of the best way to get rid of the HTML stuff from a string and just keep the plain text? Any help with this would be greatly appreciated :slightly_smiling_face:

Hey, thanks for your timely response,

I already have the data; all I need is to convert it to plain text so it can be shown to a user. I don’t believe that module does what I’m looking for, not from the description/examples at least :slightly_smiling_face:

Excuse me, I didn’t check the module at the time.

THIS should be what you’re looking for.

Example code:

local html = [[<p id="test">Hi &amp; test </p>]]

local HTML = require(script.html)

print(HTML.parse(html):select("#test")[1]:content())

Although I will warn you, it does not decode entities such as &amp;.

This looks promising! :slightly_smiling_face: However, my HTML text does not come with an ID; it is just a string with HTML tags. I’ve tried playing around to get it to work with what I have but can’t seem to. Any ideas?

Since the HTML text is returned from a value in a table, it’s just a plain string: <p>Example text and <a href=\"https://www.google.com">example hyperlink</a> </p>
As such, it does not have an ID associated with it, so I’m not sure how to make this particular scenario work with that. :slight_smile:

Sorry that I’ve not responded in a while.

Anyways, it looks like it isn’t really even valid HTML. So the library will be useless. But, it might help you in other parts of what you’re doing.

Anyways, try this:

print((str:gsub("<[^>]*>", ""))) -- extra parentheses drop gsub's second return value (the match count)

But once again, it will strip all formatting, and entities such as &amp; will not be replaced.
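
For anyone who also needs the entities handled, here is a minimal sketch that combines the tag-stripping pattern above with a small lookup table for a few common named entities. The function name htmlToPlainText and the ENTITIES table are hypothetical and only for illustration; the table covers just five entities, and numeric entities such as &#39; are not handled. Tags are stripped before entities are decoded so that an escaped &lt;b&gt; in the text is not mistaken for a real tag.

```lua
-- Hypothetical lookup table: only a handful of named entities are covered.
local ENTITIES = {
    amp = "&",
    lt = "<",
    gt = ">",
    quot = '"',
    apos = "'",
}

-- Hypothetical helper: pattern-based, not a real HTML parser.
local function htmlToPlainText(html)
    -- Remove anything shaped like a tag: "<", any run of non-">" characters, ">".
    local text = html:gsub("<[^>]*>", "")
    -- Look up each named entity in the table; names not in the table are left as-is.
    text = text:gsub("&(%a+);", ENTITIES)
    -- Collapse runs of whitespace to a single space, then trim both ends.
    text = text:gsub("%s+", " ")
    text = text:match("^%s*(.-)%s*$")
    return text
end

local msg = '<p>This is a demonstration &amp; a <a href="https://www.google.com">hyperlink example</a>  </p>'
print(htmlToPlainText(msg)) --> This is a demonstration & a hyperlink example
```

Because this relies on patterns only, it will likely misbehave on input where a ">" appears inside an attribute value, and it keeps the contents of any script or style elements as text. For short API strings like the one in this thread it should be enough.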