Is there something like DOMParser that can extract URLs from text without normalizing them, using native JS objects? - Go Server Development

If you need to extract the raw, non-normalized URLs from text and want to use native JavaScript (without relying on DOMParser), consider the following options:

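1. Plain regular expressions (recommended)

Match URLs in the HTML/text directly with a regular expression, which avoids any normalization introduced by DOM parsing: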

const text = `
    <a href="./test?x=1&y=2">Link</a>
    <img src="http://example.com/path with spaces.jpg">
`;

// Match URLs in href/src/url attributes; \1 requires the closing quote to match the opening one
const urlRegex = /(?:href|src|url)\s*=\s*(["'])(.*?)\1/gi;
const urls = [...text.matchAll(urlRegex)].map(match => match[2]);

console.log(urls); 
// ["./test?x=1&y=2", "http://example.com/path with spaces.jpg"]

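✅ Pros:

Fully native JS, with no normalization at all.
Preserves the original string (including spaces and special characters).

❌ Cons:

Each attribute (such as href, src, data-url) has to be handled by hand.
Complex nested markup may need a more elaborate regex.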
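
The hand-maintained attribute list noted in the cons can be turned into a parameter. The following is a minimal sketch, not a full HTML parser; `extractAttrValues` and its default attribute list are hypothetical names chosen for illustration. It assumes quoted values, requires the closing quote to match the opening one, and anchors on preceding whitespace so that `src` does not match inside `data-src`.

```javascript
// Hypothetical helper: collect raw attribute values for the given attribute names.
// Assumes quoted values; unquoted values, comments, and scripts are not handled.
function extractAttrValues(text, attrs = ['href', 'src', 'data-url']) {
    // Escape regex metacharacters in attribute names before building the pattern.
    const names = attrs.map(a => a.replace(/[.*+?^${}()|[\]\\]/g, '\\$&')).join('|');
    // \s before the name avoids matching "src" inside "data-src";
    // (["']) ... \2 forces the closing quote to match the opening quote.
    const re = new RegExp(`\\s(${names})\\s*=\\s*(["'])(.*?)\\2`, 'gi');
    return [...text.matchAll(re)].map(m => ({ attr: m[1].toLowerCase(), value: m[3] }));
}

const sample = `<a href="./test?x=1&amp;y=2">Link</a>
<img data-src='lazy.png' src="http://example.com/path with spaces.jpg">`;

console.log(extractAttrValues(sample, ['href', 'src']));
// [
//   { attr: "href", value: "./test?x=1&amp;y=2" },
//   { attr: "src", value: "http://example.com/path with spaces.jpg" }
// ]
```

Because this works on the raw text, character references such as `&amp;` come back undecoded, which is usually what "raw" extraction should mean.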

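2. `template` + `innerHTML` (browser only)

Use a <template> element to parse the HTML without rendering it. Reading the value with getAttribute() skips URL normalization, although the HTML parser still decodes character references: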

const html = `<a href="./test?x=1&y=2">Link</a>`;
const template = document.createElement('template');
template.innerHTML = html; // parsed but not rendered

// Read the href without URL normalization
const link = template.content.querySelector('a');
console.log(link.getAttribute('href')); // "./test?x=1&y=2"
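
For contrast, it may help to see what "normalization" looks like. Reflected DOM properties such as `link.href` and `img.src` resolve and serialize URLs according to the WHATWG URL Standard, as `new URL()` does, whereas `getAttribute()` hands back the attribute string. The sketch below uses `new URL()` (available in browsers and Node.js); the base URL `http://example.com/dir/` is an assumption made only for illustration.

```javascript
// Raw values as they might appear in markup.
const rawValues = [
    './test?x=1&y=2',
    'http://example.com/path with spaces.jpg',
    'HTTP://EXAMPLE.com/a/../b',
];

// Assumed base URL, used only to resolve the relative entry.
const base = 'http://example.com/dir/';

// new URL() resolves relative references, lowercases scheme and host,
// collapses dot segments, and percent-encodes spaces in the path.
const normalized = rawValues.map(raw => new URL(raw, base).href);

console.log(normalized);
// [
//   "http://example.com/dir/test?x=1&y=2",
//   "http://example.com/path%20with%20spaces.jpg",
//   "http://example.com/b"
// ]
```

None of these transformations happen when the value is read with `getAttribute()` or taken from the raw text.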

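✅ Pros:

Uses the standard DOM API, so the browser's HTML parser deals with quoting and nesting.

❌ Cons:

getAttribute() returns the attribute value as parsed: the URL is not resolved or percent-encoded, but HTML character references are decoded (&amp; becomes &). The link.href property, by contrast, returns a resolved, normalized URL.
<template> is a browser API and is not available in Node.js.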

3. `Range.createContextualFragment()`

Similar to <template>, but produces a document fragment directly:

const html = `<img src="http://example.com/a b.jpg">`;
const range = document.createRange();
const fragment = range.createContextualFragment(html);

// Read the src without URL normalization
const img = fragment.querySelector('img');
console.log(img.getAttribute('src')); // "http://example.com/a b.jpg"

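⚠️ Note: As with <template>, getAttribute() returns the value without URL normalization, but the HTML parser decodes character references such as &amp;. Reading img.src instead would return the resolved, percent-encoded URL.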

4. XMLHttpRequest / Fetch + Response.text()

If the HTML comes from a network request, you can work on the raw text directly:

fetch('some-page.html')
    .then(response => response.text())
    .then(html => {
        // matchAll exposes the capture group, so only the URL itself is collected
        const urls = [...html.matchAll(/src=["']([^"']+)["']/gi)].map(m => m[1]);
        console.log(urls); // e.g. ["./raw?q=1&p=2"]
    });

✅ The response text is the exact original string, with no modification.
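
If the response body is plain text rather than HTML, there are no attributes to anchor on. Below is a rough sketch for pulling out bare `http(s)` URLs exactly as written; the pattern and the name `extractBareUrls` are illustrative assumptions. The pattern deliberately stops at whitespace, quotes, and angle brackets, so a URL containing a literal space is cut short, and trailing punctuation (including a closing parenthesis) is trimmed.

```javascript
// Hypothetical helper: return bare http(s) URLs from plain text without
// decoding, resolving, or otherwise rewriting them.
function extractBareUrls(text) {
    // http:// or https:// followed by a run of characters that are not
    // whitespace, quotes, or angle brackets.
    const re = /https?:\/\/[^\s"'<>]+/gi;
    // Trim trailing sentence punctuation that is unlikely to belong to the URL.
    return (text.match(re) || []).map(u => u.replace(/[.,;:!?)]+$/, ''));
}

const body = 'See HTTP://Example.com/A/../b?x=1&y=2, or https://example.org/%7Euser/.';
console.log(extractBareUrls(body));
// ["HTTP://Example.com/A/../b?x=1&y=2", "https://example.org/%7Euser/"]
```

Letter case, dot segments, and percent-escapes all survive untouched, since nothing here goes through a URL parser.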

Last resort: a custom lightweight parser

If none of the above meets your needs (for example, you need a precise AST), consider writing a minimal HTML tokenizer in plain JS:

function extractRawUrls(html) {
    const urls = [];
    let pos = 0;

    while (pos < html.length) {
        // Locate the next double-quoted attribute value (single quotes are not handled)
        const attrStart = html.indexOf('="', pos);
        if (attrStart === -1) break;

        const attrEnd = html.indexOf('"', attrStart + 2);
        if (attrEnd === -1) break;

        const attrValue = html.slice(attrStart + 2, attrEnd);
        // Keep the value only if the text just before =" ends with a wanted attribute name
        if (['href', 'src', 'data-url'].some(attr =>
            html.slice(pos, attrStart).endsWith(attr)
        )) {
            urls.push(attrValue);
        }

        // Continue scanning after the closing quote
        pos = attrEnd + 1;
    }

    return urls;
}

console.log(extractRawUrls(`<a href="/path?a=1&b=2">`)); // ["/path?a=1&b=2"]
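
The scanner above only recognizes double-quoted values and can be fooled by `="` appearing in ordinary text. A slightly fuller sketch, still far from a complete HTML tokenizer, walks each tag character by character and also accepts single-quoted and unquoted values. The name `scanAttrValues` and its default attribute list are hypothetical; comments, `<script>`/`<style>` bodies, and CDATA are not treated specially.

```javascript
// Hypothetical scanner: returns raw attribute values for the requested names.
function scanAttrValues(html, wanted = ['href', 'src', 'data-url']) {
    const out = [];
    const isSpace = c => c === ' ' || c === '\t' || c === '\n' || c === '\r' || c === '\f';
    let i = 0;
    while (i < html.length) {
        // Find the next "<"; ignore closing tags and anything not starting a tag name.
        i = html.indexOf('<', i);
        if (i === -1) break;
        i++;
        if (!/[a-zA-Z]/.test(html[i] || '')) continue;
        // Skip the tag name.
        while (i < html.length && !isSpace(html[i]) && html[i] !== '>' && html[i] !== '/') i++;
        // Read attributes until the tag closes.
        while (i < html.length && html[i] !== '>') {
            while (i < html.length && (isSpace(html[i]) || html[i] === '/')) i++;
            if (i >= html.length || html[i] === '>') break;
            // Attribute name runs until whitespace, "=", "/", or ">".
            const nameStart = i;
            while (i < html.length && !isSpace(html[i]) &&
                   html[i] !== '=' && html[i] !== '>' && html[i] !== '/') i++;
            const name = html.slice(nameStart, i).toLowerCase();
            while (i < html.length && isSpace(html[i])) i++;
            if (html[i] !== '=') continue; // attribute without a value
            i++;
            while (i < html.length && isSpace(html[i])) i++;
            let value;
            if (html[i] === '"' || html[i] === "'") {
                // Quoted value: take everything up to the matching quote, verbatim.
                const quote = html[i];
                const end = html.indexOf(quote, i + 1);
                if (end === -1) return out; // unterminated value, stop scanning
                value = html.slice(i + 1, end);
                i = end + 1;
            } else {
                // Unquoted value: runs until whitespace or ">".
                const start = i;
                while (i < html.length && !isSpace(html[i]) && html[i] !== '>') i++;
                value = html.slice(start, i);
            }
            if (wanted.includes(name)) out.push(value);
        }
    }
    return out;
}

const doc = `<p>href="not-a-link"</p><a class=x href='/p?a=1&amp;b=2'><img src = "a b.jpg">`;
console.log(scanAttrValues(doc)); // ["/p?a=1&amp;b=2", "a b.jpg"]
```

Values are sliced straight out of the input, so entities, spaces, and relative paths are returned exactly as authored.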

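Summary comparison

Method	Native JS?	Preserves original?	Use case
Regex matching	✅ Yes	✅ Yes	Quick, simple extraction
`<template>`	✅ Yes	⚠️ Partial (character references decoded)	Parsing into a DOM without rendering
`Range.createContextualFragment()`	✅ Yes	⚠️ Partial (character references decoded)	Building fragments dynamically
Fetch + Text	✅ Yes	✅ Yes	Remote HTML
Custom Parser	✅ Yes	✅ Yes	Full control over parsing logic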