Extract The Data From Content Of Html

May 25, 2024

I'm trying to extract data from HTML. I fetched the page with curl, but all I need is to pass the title to another variable.

Solution 1:

You should use a parser to pull values out of HTML files, strings, or documents. Here's an example using DOMDocument:

$string = '<meta property="og:url" content="https://example.com/">';
$doc = new DOMDocument();
$doc->loadHTML($string);
// Collect every <meta> element in the document
$metas = $doc->getElementsByTagName('meta');
foreach ($metas as $meta) {
    // Print the content of the meta tag whose property is og:url
    if ($meta->getAttribute('property') == 'og:url') {
        echo $meta->getAttribute('content');
    }
}

Output:

https://example.com/
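
The question asks for the title specifically, and the same parser can pull the <title> element into a variable. The following is a minimal sketch, assuming the HTML is already available as a string; the sample markup and the variable names $html and $title are made up for illustration.

```php
<?php
// Hypothetical sample markup; in practice this would be the response body from curl.
$html = '<html><head><title>Example Page</title></head><body><p>Hello</p></body></html>';

// Collect libxml warnings internally so imperfect real-world markup does not emit notices.
libxml_use_internal_errors(true);
$doc = new DOMDocument();
$doc->loadHTML($html);
libxml_clear_errors();

// getElementsByTagName returns a DOMNodeList; item(0) is null when there is no <title>.
$titleNode = $doc->getElementsByTagName('title')->item(0);
$title = $titleNode !== null ? trim($titleNode->textContent) : null;

echo $title, PHP_EOL; // prints "Example Page"
```

Here $title ends up as a plain string (or null when the page has no title), so it can be passed on like any other variable.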

Solution 2:

If you are loading the HTML from a remote location rather than a local string, you can let DOMDocument fetch the page with loadHTMLFile and query it with DOMXPath, something like:

// Collect libxml parse warnings internally instead of printing them
libxml_use_internal_errors(true);
$dom = new DOMDocument;
$dom->loadHTMLFile('https://evernote.com');
libxml_clear_errors();
$xp = new DOMXPath($dom);
$nodes = $xp->query('//meta[@property="og:url"]');
// item(0) is null when nothing matched, so check the length first
if ($nodes->length > 0) {
    // Read the content attribute of the first matching meta tag
    print $nodes->item(0)->getAttribute('content');
}

This outputs the expected value:

https://evernote.com/
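
When the markup has already been fetched (for example with curl, as in the question), the XPath lookup can be wrapped in a small helper that returns null instead of failing when the tag is missing. This is only a sketch under those assumptions: the function name getMetaProperty and the sample strings are invented for illustration and are not part of the original answers.

```php
<?php
/**
 * Return the content attribute of the first <meta property="..."> tag,
 * or null when no such tag exists. Hypothetical helper for illustration.
 */
function getMetaProperty(string $html, string $property): ?string
{
    // DOMDocument::loadHTML rejects an empty string, so bail out early.
    if ($html === '') {
        return null;
    }

    // Collect parse warnings internally, then restore the previous setting.
    $previous = libxml_use_internal_errors(true);
    $dom = new DOMDocument();
    $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xp = new DOMXPath($dom);
    // Compare the property name in PHP rather than splicing it into the
    // XPath expression, which avoids quoting problems.
    foreach ($xp->query('//meta[@property]') as $meta) {
        if ($meta->getAttribute('property') === $property) {
            return $meta->getAttribute('content');
        }
    }
    return null;
}

// Usage with a made-up snippet of markup:
$html = '<head><meta property="og:url" content="https://example.com/"></head>';
echo getMetaProperty($html, 'og:url'), PHP_EOL;      // prints "https://example.com/"
var_dump(getMetaProperty($html, 'og:title'));        // NULL, the tag is absent
```

Returning null for the missing case lets the caller decide what to do, rather than triggering an error on a null node as an unchecked item(0) call would.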
