How do I troubleshoot issues when bringing my custom container to Amazon SageMaker Studio?

I want to troubleshoot issues that occur when I bring a custom container image to Amazon SageMaker Studio.

Resolution

A SageMaker image is a file that identifies the kernels, language packages, and other dependencies that are required to run a Jupyter notebook in SageMaker Studio. These images create the environment in which your Jupyter notebooks run. If the built-in images provided by SageMaker don't support your use case, you can bring a custom image to use in SageMaker Studio.

You might get errors when you use your custom image in SageMaker Studio. These errors usually result from an incorrect configuration that was set during the container image build process. Therefore, be sure that the custom image is compatible with SageMaker Studio.

To do so, check the following when you build the Dockerfile:

  • You have set the DefaultUID and DefaultGID values accurately. SageMaker Studio supports only specific combinations of DefaultUID and DefaultGID. For a non-privileged user, be sure that DefaultUID and DefaultGID are set to 1000 and 100, respectively. For the root user, be sure that both values are set to 0.
  • The '/opt/.sagemakerinternal' and '/opt/ml' directories aren't used. These directories are reserved.
  • When you create the configuration file app-image-config-input.json, be sure that the Name value for KernelSpec in this file matches the Name value for the associated image.
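
The checks above can be sketched as a small pre-flight script that inspects the contents of app-image-config-input.json before you register the image. This is a minimal illustration, not an official validator: the function name check_app_image_config and the expected_name parameter are hypothetical, and the key names (KernelGatewayImageConfig, FileSystemConfig, DefaultUid, DefaultGid, MountPath, KernelSpecs) are assumed to follow the CreateAppImageConfig request shape, so verify them against your own file.

```python
# Reserved directories named in the checklist above.
RESERVED_DIRS = ("/opt/.sagemakerinternal", "/opt/ml")
# (uid, gid) pairs the checklist describes as supported.
SUPPORTED_IDS = {(1000, 100), (0, 0)}


def check_app_image_config(config, expected_name):
    """Return a list of problems found in an app image config dict.

    `expected_name` is the name that the KernelSpec Name is supposed
    to match (hypothetical parameter, supplied by the caller).
    """
    problems = []
    # Key names below are assumptions based on the CreateAppImageConfig shape.
    kernel_gateway = config.get("KernelGatewayImageConfig", {})
    file_system = kernel_gateway.get("FileSystemConfig", {})

    ids = (file_system.get("DefaultUid"), file_system.get("DefaultGid"))
    if ids not in SUPPORTED_IDS:
        problems.append(f"unsupported DefaultUid/DefaultGid combination: {ids}")

    mount_path = file_system.get("MountPath", "")
    for reserved in RESERVED_DIRS:
        if mount_path == reserved or mount_path.startswith(reserved + "/"):
            problems.append(f"MountPath uses reserved directory: {reserved}")

    names = [spec.get("Name") for spec in kernel_gateway.get("KernelSpecs", [])]
    if expected_name not in names:
        problems.append(f"no KernelSpec named {expected_name!r}; found {names}")

    return problems
```

To use it, load the file with json.load and pass the resulting dict; an empty list means none of the three checks failed.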

Example Dockerfile that installs Python packages and sets the scope to non-privileged users:

FROM public.ecr.aws/amazonlinux/amazonlinux:2

ARG NB_USER="sagemaker-user"
ARG NB_UID="1000"
ARG NB_GID="100"

RUN \
    yum install --assumeyes python3 shadow-utils && \
    useradd --create-home --shell /bin/bash --gid "${NB_GID}" --uid ${NB_UID} ${NB_USER} && \
    yum clean all && \
    python3 -m pip install ipykernel && \
    python3 -m ipykernel install

USER ${NB_UID}

You can view the error messages in Amazon CloudWatch Logs when either of the following happens:

  • Image version creation fails.
  • You get an error when launching the image in SageMaker Studio.

You can find these messages in the log group /aws/sagemaker/studio and log stream $domainID/$userProfileName/KernelGateway/$appName.
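
As a rough sketch of how you might pull those messages programmatically, the following Python builds the log stream name from its three parts and reads the most recent events with the boto3 CloudWatch Logs client. The helper names studio_log_stream and fetch_recent_messages are hypothetical, and the example assumes that boto3 is installed and that your credentials are allowed to read CloudWatch Logs in the Region where the domain runs.

```python
LOG_GROUP = "/aws/sagemaker/studio"


def studio_log_stream(domain_id, user_profile_name, app_name):
    """Build the stream name: $domainID/$userProfileName/KernelGateway/$appName."""
    return f"{domain_id}/{user_profile_name}/KernelGateway/{app_name}"


def fetch_recent_messages(domain_id, user_profile_name, app_name, limit=50):
    """Return up to `limit` of the latest log messages for a KernelGateway app."""
    # Imported here so that studio_log_stream works even without boto3 installed.
    import boto3

    logs = boto3.client("logs")
    response = logs.get_log_events(
        logGroupName=LOG_GROUP,
        logStreamName=studio_log_stream(domain_id, user_profile_name, app_name),
        limit=limit,
        startFromHead=False,  # read from the newest events backward
    )
    return [event["message"] for event in response["events"]]
```

Calling fetch_recent_messages with your own domain ID, user profile name, and app name returns the raw message strings, which you can then search for the error that Studio reported.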

Note: This article assumes that the AWS Identity and Access Management (IAM) user or role has the AmazonSageMakerFullAccess policy attached to it.


Related information

Custom SageMaker image specifications

AWS OFFICIAL
Updated a year ago