Many modern websites generate or load their content dynamically with JavaScript. This means that if you want to download or scrape them, or use them with an LLM or MCP server, you need to load them in a full web browser. You can't just curl the page and use that HTML: you'll miss most or all of the content.
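As a rough illustration of the problem, here is a minimal sketch of a heuristic that guesses whether HTML fetched without a browser is just an empty JavaScript shell. The names `VisibleTextCounter` and `looks_client_rendered` and the 200-character threshold are assumptions made up for this example, not part of the Dockerfile or script described here, and the check will misjudge some pages.

```python
from html.parser import HTMLParser


class VisibleTextCounter(HTMLParser):
    """Counts visible text characters and <script> tags in an HTML document."""

    def __init__(self):
        super().__init__()
        self.text_chars = 0
        self.script_tags = 0
        self._skip_depth = 0  # > 0 while inside <script>, <style>, or <noscript>

    def handle_starttag(self, tag, attrs):
        if tag == "script":
            self.script_tags += 1
        if tag in ("script", "style", "noscript"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style", "noscript") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Only count text a reader would actually see on the page.
        if self._skip_depth == 0:
            self.text_chars += len(data.strip())


def looks_client_rendered(html, min_text_chars=200):
    """Heuristic (assumed threshold): scripts present but almost no visible text."""
    counter = VisibleTextCounter()
    counter.feed(html)
    counter.close()
    return counter.script_tags > 0 and counter.text_chars < min_text_chars


# Hypothetical example of what curl might return for a single-page app.
shell_html = (
    "<html><head><title>App</title></head><body>"
    '<div id="root"></div><script src="/bundle.js"></script>'
    "</body></html>"
)
print(looks_client_rendered(shell_html))
```

If a check like this flags a page, the HTML you fetched probably contains little of what a visitor sees, and the page needs to be rendered in a browser first.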
So, to do this, you can use this Dockerfile and script to download any web page in a real browser and convert it to Markdown.