2021-09-01

When TensorFlow cannot match the checkpoint file: ERROR:tensorflow:Couldn't match files for checkpoint

It seems that TensorFlow records the location of each checkpoint as an absolute path. So if you move the checkpoints from one computer to another, or to a different directory, you may run into an error like the one below:

ERROR:tensorflow:Couldn't match files for checkpoint /root/result_base/cnn_dailymail/sent_delete/model.ckpt-273863

The fix is simple. Open the file named "checkpoint" in the folder that stores the TensorFlow results of the experiment, and replace the checkpoint paths with ones that are valid on the new computer.

For example, from

model_checkpoint_path: "/root/result_base/cnn_dailymail/sent_delete/model.ckpt-273863"
all_model_checkpoint_paths: "/root/result_base/cnn_dailymail/sent_delete/model.ckpt-273863"

to

model_checkpoint_path: "/mnt/12T/data/NLP/result_base/cnn_dailymail/sent_delete/model.ckpt-273863"
all_model_checkpoint_paths: "/mnt/12T/data/NLP/result_base/cnn_dailymail/sent_delete/model.ckpt-273863"
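
If there are many checkpoint folders to fix, the same edit can be scripted. Below is a minimal sketch in plain Python (it does not use TensorFlow itself). The function names rewrite_checkpoint_paths and fix_checkpoint_file, and the backup name checkpoint.bak, are my own choices for illustration. It assumes the file is named "checkpoint" and uses the key: "path" line format shown above.

```python
import re
from pathlib import Path


def rewrite_checkpoint_paths(text: str, old_prefix: str, new_prefix: str) -> str:
    """Swap old_prefix for new_prefix at the start of every quoted path.

    Assumes each line looks like:  some_key: "/absolute/path/model.ckpt-N"
    """
    pattern = re.compile(
        r'^(\s*\w+:\s*")' + re.escape(old_prefix), flags=re.MULTILINE
    )
    # A function replacement avoids backslash-escape surprises in new_prefix.
    return pattern.sub(lambda m: m.group(1) + new_prefix, text)


def fix_checkpoint_file(checkpoint_dir: str, old_prefix: str, new_prefix: str) -> None:
    """Rewrite <checkpoint_dir>/checkpoint in place, keeping a backup copy."""
    index_file = Path(checkpoint_dir) / "checkpoint"  # assumed file name
    original = index_file.read_text()
    # Hypothetical backup name, so the original can be restored if needed.
    index_file.with_name("checkpoint.bak").write_text(original)
    index_file.write_text(rewrite_checkpoint_paths(original, old_prefix, new_prefix))
```

For the example above, the call would be fix_checkpoint_file("/mnt/12T/data/NLP/result_base/cnn_dailymail/sent_delete", "/root", "/mnt/12T/data/NLP"). Only the leading prefix of each path is replaced, so the rest of the path and the checkpoint number stay as they were.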
