Starting ProxyStore Endpoint on Polaris at ALCF #663

@Zilinghan

Hi, I am trying to start a proxystore-endpoint on ALCF's Polaris. When I start the endpoint, I get the following errors in the log.txt file:

[2025-01-29 20:27:32.141] INFO  (proxystore.p2p.nat) :: Checking NAT type. This may take a moment...
[2025-01-29 20:27:32.141] ERROR (proxystore.p2p.nat) :: Failed to determine NAT type: [Errno 101] Network is unreachable
[2025-01-29 20:27:32.143] ERROR (proxystore.endpoint.serve) :: Caught unhandled exception: OSError(101, 'Network is unreachable')
Traceback (most recent call last):
  File "/eagle/tpc/zilinghan/conda_envs/appfl/lib/python3.10/site-packages/proxystore/endpoint/serve.py", line 230, in serve
    asyncio.run(_serve_async(config))
  File "/eagle/tpc/zilinghan/conda_envs/appfl/lib/python3.10/asyncio/runners.py", line 44, in run
    return loop.run_until_complete(main)
  File "uvloop/loop.pyx", line 1518, in uvloop.loop.Loop.run_until_complete
  File "/eagle/tpc/zilinghan/conda_envs/appfl/lib/python3.10/site-packages/proxystore/endpoint/serve.py", line 151, in _serve_async
    endpoint = await Endpoint(
  File "/eagle/tpc/zilinghan/conda_envs/appfl/lib/python3.10/site-packages/proxystore/endpoint/endpoint.py", line 218, in __aenter__
    await self.async_init()
  File "/eagle/tpc/zilinghan/conda_envs/appfl/lib/python3.10/site-packages/proxystore/endpoint/endpoint.py", line 243, in async_init
    await self._peer_manager.async_init()
  File "/eagle/tpc/zilinghan/conda_envs/appfl/lib/python3.10/site-packages/proxystore/p2p/manager.py", line 139, in async_init
    await self._relay_client.connect()
  File "/eagle/tpc/zilinghan/conda_envs/appfl/lib/python3.10/site-packages/proxystore/p2p/relay/client.py", line 293, in connect
    self._websocket = await self._register(
  File "/eagle/tpc/zilinghan/conda_envs/appfl/lib/python3.10/site-packages/proxystore/p2p/relay/client.py", line 179, in _register
    websocket = await websockets.client.connect(
  File "/eagle/tpc/zilinghan/conda_envs/appfl/lib/python3.10/site-packages/websockets/legacy/client.py", line 650, in __await_impl_timeout__
    return await asyncio.wait_for(self.__await_impl__(), self.open_timeout)
  File "/eagle/tpc/zilinghan/conda_envs/appfl/lib/python3.10/asyncio/tasks.py", line 445, in wait_for
    return fut.result()
  File "/eagle/tpc/zilinghan/conda_envs/appfl/lib/python3.10/site-packages/websockets/legacy/client.py", line 654, in __await_impl__
    transport, protocol = await self._create_connection()
  File "uvloop/loop.pyx", line 2043, in create_connection
  File "uvloop/loop.pyx", line 2019, in uvloop.loop.Loop.create_connection
  File "uvloop/handles/tcp.pyx", line 182, in uvloop.loop.TCPTransport.connect
  File "uvloop/handles/tcp.pyx", line 204, in uvloop.loop._TCPConnectRequest.connect
OSError: [Errno 101] Network is unreachable

Before starting the endpoint, I ran the following lines to route traffic through the proxy host, as specified in ALCF's documentation. Is there anything else I need to do to start the endpoint? Thanks!

# proxy settings
export HTTP_PROXY="http://proxy.alcf.anl.gov:3128"
export HTTPS_PROXY="http://proxy.alcf.anl.gov:3128"
export http_proxy="http://proxy.alcf.anl.gov:3128"
export https_proxy="http://proxy.alcf.anl.gov:3128"
export ftp_proxy="http://proxy.alcf.anl.gov:3128"
export no_proxy="admin,polaris-adminvm-01,localhost,*.cm.polaris.alcf.anl.gov,polaris-*,*.polaris.alcf.anl.gov,*.alcf.anl.gov"
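
One way to narrow this down is to check whether the relay is reachable directly from the node, or only through the HTTP proxy. The traceback ends in a plain TCP connect, which suggests (as an assumption, not a confirmed diagnosis) that the proxy variables are not being applied to the WebSocket connection. The sketch below is a standalone diagnostic using only the Python standard library. The helper names (`parse_proxy`, `build_connect_request`, `can_connect_directly`, `can_connect_via_proxy`, `report`) are hypothetical and are not part of ProxyStore or websockets. Port 443 is assumed from the `wss://` scheme of the relay address.

```python
import os
import socket
from urllib.parse import urlsplit


def parse_proxy(url):
    """Split a proxy URL such as http://host:3128 into (host, port)."""
    parts = urlsplit(url)
    return parts.hostname, parts.port or 80


def build_connect_request(host, port):
    """Build an HTTP CONNECT request asking a proxy to tunnel to host:port."""
    return (
        f"CONNECT {host}:{port} HTTP/1.1\r\n"
        f"Host: {host}:{port}\r\n\r\n"
    ).encode("ascii")


def can_connect_directly(host, port, timeout=5.0):
    """Return True if a plain TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def can_connect_via_proxy(proxy_url, host, port, timeout=5.0):
    """Return True if the proxy answers a CONNECT to host:port with status 200."""
    proxy_host, proxy_port = parse_proxy(proxy_url)
    try:
        with socket.create_connection(
            (proxy_host, proxy_port), timeout=timeout
        ) as sock:
            sock.sendall(build_connect_request(host, port))
            # Only the status line is inspected, e.g. b"HTTP/1.1 200 ...".
            status_line = sock.recv(4096).split(b"\r\n", 1)[0]
            fields = status_line.split()
            return len(fields) >= 2 and fields[1] == b"200"
    except OSError:
        return False


def report(relay_host="relay.proxystore.dev", relay_port=443):
    """Print whether the relay is reachable directly and via HTTPS_PROXY."""
    proxy = os.environ.get("HTTPS_PROXY") or os.environ.get("https_proxy")
    print("direct:", can_connect_directly(relay_host, relay_port))
    if proxy:
        print("via proxy:", can_connect_via_proxy(proxy, relay_host, relay_port))
    else:
        print("via proxy: no HTTPS_PROXY set")
```

Calling `report()` in the same shell where the proxy variables are exported prints both results. If the direct check fails while the proxy check succeeds, the node can reach the relay only through the proxy, which would be consistent with the `Network is unreachable` error above.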

My config.toml looks like this:

name = "my-endpoint"
uuid = "b6cfb02b-323f-4eac-8c42-20102bb0bd26"
port = 8765
host = "10.201.0.56"

[relay]
address = "wss://relay.proxystore.dev"
peer_channels = 1
verify_certificate = true

[relay.auth]
method = "globus"

[relay.auth.kwargs]

[storage]
max_object_size = 100000000
