Created September 8, 2019 04:06
| WARNING: Logging before flag parsing goes to stderr. | |
| W0907 21:06:26.900039 139893622470464 lazy_loader.py:50] | |
| The TensorFlow contrib module will not be included in TensorFlow 2.0. | |
| For more information, please see: | |
| * https://github.com/tensorflow/community/blob/master/rfcs/20180907-contrib-sunset.md | |
| * https://github.com/tensorflow/addons | |
| * https://github.com/tensorflow/io (for I/O related ops) | |
| If you depend on functionality not listed there, please file an issue. | |
| W0907 21:06:28.349731 139893622470464 deprecation_wrapper.py:119] From /usr/local/google/home/jlebar/code/tensor2tensor/tensor2tensor/utils/expert_utils.py:68: The name tf.variable_scope is deprecated. Please use tf.compat.v1.variable_scope instead. | |
| W0907 21:06:29.789397 139893622470464 deprecation_wrapper.py:119] From /usr/local/google/home/jlebar/code/tensor2tensor/tensor2tensor/trax/jaxboard.py:38: The name tf.HistogramProto is deprecated. Please use tf.compat.v1.HistogramProto instead. | |
| W0907 21:06:29.789571 139893622470464 deprecation_wrapper.py:119] From /usr/local/google/home/jlebar/code/tensor2tensor/tensor2tensor/trax/jaxboard.py:39: The name tf.Summary is deprecated. Please use tf.compat.v1.Summary instead. | |
| W0907 21:06:29.789673 139893622470464 deprecation_wrapper.py:119] From /usr/local/google/home/jlebar/code/tensor2tensor/tensor2tensor/trax/jaxboard.py:40: The name tf.SummaryMetadata is deprecated. Please use tf.compat.v1.SummaryMetadata instead. | |
| W0907 21:06:29.800125 139893622470464 deprecation_wrapper.py:119] From /usr/local/google/home/jlebar/code/tensor2tensor/tensor2tensor/trax/trainer.py:99: The name tf.enable_eager_execution is deprecated. Please use tf.compat.v1.enable_eager_execution instead. | |
| 2019-09-07 21:06:29.865873: W tensorflow/core/common_runtime/gpu/gpu_device.cc:1663] Cannot dlopen some GPU libraries. Skipping registering GPU devices... | |
| I0907 21:06:29.866058 139893622470464 trax.py:139] Using --output_dir /usr/local/google/tmp/trax | |
| Using --output_dir /usr/local/google/tmp/trax | |
| I0907 21:06:30.378094 139893622470464 dataset_builder.py:184] Overwrite dataset info from restored data version. | |
| I0907 21:06:30.382681 139893622470464 dataset_builder.py:184] Overwrite dataset info from restored data version. | |
| I0907 21:06:30.384521 139893622470464 dataset_builder.py:253] Reusing dataset mnist (/usr/local/google/tmp/trax/mnist/1.0.0) | |
| I0907 21:06:30.384707 139893622470464 dataset_builder.py:399] Constructing tf.data.Dataset for split train, from /usr/local/google/tmp/trax/mnist/1.0.0 | |
| W0907 21:06:30.437631 139893622470464 deprecation.py:323] From /usr/local/google/home/jlebar/.local/lib/python3.6/site-packages/tensorflow/python/data/util/random_seed.py:58: add_dispatch_support.<locals>.wrapper (from tensorflow.python.ops.array_ops) is deprecated and will be removed in a future version. | |
| Instructions for updating: | |
| Use tf.where in 2.0, which has the same broadcast rule as np.where | |
| 2019-09-07 21:06:30.478029: W tensorflow/core/common_runtime/eager/context.cc:371] Added two functions with the same name: __inference_Dataset_flat_map_read_one_file_25 | |
| 2019-09-07 21:06:30.478126: W tensorflow/core/common_runtime/eager/context.cc:371] Added two functions with the same name: __inference_Dataset_map_ExampleParser.parse_example_39 | |
| 2019-09-07 21:06:30.537407: W tensorflow/core/common_runtime/eager/context.cc:371] Added two functions with the same name: cond_gif_true_89 | |
| 2019-09-07 21:06:30.537503: W tensorflow/core/common_runtime/eager/context.cc:371] Added two functions with the same name: cond_gif_false_90 | |
| 2019-09-07 21:06:30.543452: W tensorflow/core/common_runtime/eager/context.cc:371] Added two functions with the same name: cond_png_true_78 | |
| 2019-09-07 21:06:30.543508: W tensorflow/core/common_runtime/eager/context.cc:371] Added two functions with the same name: cond_png_false_79 | |
| 2019-09-07 21:06:30.543560: W tensorflow/core/common_runtime/eager/context.cc:371] Added two functions with the same name: cond_gif_true_89 | |
| 2019-09-07 21:06:30.543624: W tensorflow/core/common_runtime/eager/context.cc:371] Added two functions with the same name: cond_gif_false_90 | |
| 2019-09-07 21:06:30.550423: W tensorflow/core/common_runtime/eager/context.cc:371] Added two functions with the same name: decode_image_cond_jpeg_true_59 | |
| 2019-09-07 21:06:30.550512: W tensorflow/core/common_runtime/eager/context.cc:371] Added two functions with the same name: decode_image_cond_jpeg_false_60 | |
| 2019-09-07 21:06:30.550574: W tensorflow/core/common_runtime/eager/context.cc:371] Added two functions with the same name: cond_png_true_78 | |
| 2019-09-07 21:06:30.550607: W tensorflow/core/common_runtime/eager/context.cc:371] Added two functions with the same name: cond_png_false_79 | |
| 2019-09-07 21:06:30.550649: W tensorflow/core/common_runtime/eager/context.cc:371] Added two functions with the same name: cond_gif_true_89 | |
| 2019-09-07 21:06:30.550708: W tensorflow/core/common_runtime/eager/context.cc:371] Added two functions with the same name: cond_gif_false_90 | |
| I0907 21:06:30.553653 139893622470464 dataset_builder.py:184] Overwrite dataset info from restored data version. | |
| I0907 21:06:30.556012 139893622470464 dataset_builder.py:253] Reusing dataset mnist (/usr/local/google/tmp/trax/mnist/1.0.0) | |
| I0907 21:06:30.556189 139893622470464 dataset_builder.py:399] Constructing tf.data.Dataset for split test, from /usr/local/google/tmp/trax/mnist/1.0.0 | |
| 2019-09-07 21:06:30.579470: W tensorflow/core/common_runtime/eager/context.cc:371] Added two functions with the same name: __inference_Dataset_flat_map_read_one_file_154 | |
| 2019-09-07 21:06:30.579543: W tensorflow/core/common_runtime/eager/context.cc:371] Added two functions with the same name: __inference_Dataset_map_ExampleParser.parse_example_168 | |
| 2019-09-07 21:06:30.635790: W tensorflow/core/common_runtime/eager/context.cc:371] Added two functions with the same name: cond_gif_true_218 | |
| 2019-09-07 21:06:30.635885: W tensorflow/core/common_runtime/eager/context.cc:371] Added two functions with the same name: cond_gif_false_219 | |
| 2019-09-07 21:06:30.641705: W tensorflow/core/common_runtime/eager/context.cc:371] Added two functions with the same name: cond_png_true_207 | |
| 2019-09-07 21:06:30.641761: W tensorflow/core/common_runtime/eager/context.cc:371] Added two functions with the same name: cond_png_false_208 | |
| 2019-09-07 21:06:30.641813: W tensorflow/core/common_runtime/eager/context.cc:371] Added two functions with the same name: cond_gif_true_218 | |
| 2019-09-07 21:06:30.641878: W tensorflow/core/common_runtime/eager/context.cc:371] Added two functions with the same name: cond_gif_false_219 | |
| 2019-09-07 21:06:30.648735: W tensorflow/core/common_runtime/eager/context.cc:371] Added two functions with the same name: decode_image_cond_jpeg_true_188 | |
| 2019-09-07 21:06:30.648824: W tensorflow/core/common_runtime/eager/context.cc:371] Added two functions with the same name: decode_image_cond_jpeg_false_189 | |
| 2019-09-07 21:06:30.648884: W tensorflow/core/common_runtime/eager/context.cc:371] Added two functions with the same name: cond_png_true_207 | |
| 2019-09-07 21:06:30.648917: W tensorflow/core/common_runtime/eager/context.cc:371] Added two functions with the same name: cond_png_false_208 | |
| 2019-09-07 21:06:30.648959: W tensorflow/core/common_runtime/eager/context.cc:371] Added two functions with the same name: cond_gif_true_218 | |
| 2019-09-07 21:06:30.649018: W tensorflow/core/common_runtime/eager/context.cc:371] Added two functions with the same name: cond_gif_false_219 | |
| W0907 21:06:30.658030 139893622470464 deprecation_wrapper.py:119] From /usr/local/google/home/jlebar/code/tensor2tensor/tensor2tensor/trax/inputs.py:332: The name tf.logging.info is deprecated. Please use tf.compat.v1.logging.info instead. | |
| I0907 21:06:30.658161 139893622470464 inputs.py:333] Heuristically setting bucketing to False based on shapes of target tensors. | |
| I0907 21:06:30.668358 139893622470464 inputs.py:333] Heuristically setting bucketing to False based on shapes of target tensors. | |
| I0907 21:06:30.675841 139893622470464 inputs.py:333] Heuristically setting bucketing to False based on shapes of target tensors. | |
| 2019-09-07 21:06:30.679798: E external/org_tensorflow/tensorflow/stream_executor/stream.cc:332] Error recording event in stream: error recording CUDA event on stream 0x73329b0: CUDA_ERROR_INVALID_HANDLE: invalid resource handle; not marking stream as bad, as the Event object may be at fault. Monitor for further errors. | |
| 2019-09-07 21:06:30.680048: E external/org_tensorflow/tensorflow/stream_executor/stream.cc:332] Error recording event in stream: error recording CUDA event on stream 0x73329b0: CUDA_ERROR_INVALID_HANDLE: invalid resource handle; not marking stream as bad, as the Event object may be at fault. Monitor for further errors. | |
| 2019-09-07 21:06:30.680270: E external/org_tensorflow/tensorflow/stream_executor/stream.cc:332] Error recording event in stream: error recording CUDA event on stream 0x73329b0: CUDA_ERROR_INVALID_HANDLE: invalid resource handle; not marking stream as bad, as the Event object may be at fault. Monitor for further errors. | |
| 2019-09-07 21:06:30.680468: E external/org_tensorflow/tensorflow/stream_executor/stream.cc:332] Error recording event in stream: error recording CUDA event on stream 0x73329b0: CUDA_ERROR_INVALID_HANDLE: invalid resource handle; not marking stream as bad, as the Event object may be at fault. Monitor for further errors. | |
| 2019-09-07 21:06:30.680650: E external/org_tensorflow/tensorflow/stream_executor/stream.cc:332] Error recording event in stream: error recording CUDA event on stream 0x73329b0: CUDA_ERROR_INVALID_HANDLE: invalid resource handle; not marking stream as bad, as the Event object may be at fault. Monitor for further errors. | |
| 2019-09-07 21:06:30.681040: E external/org_tensorflow/tensorflow/stream_executor/stream.cc:332] Error recording event in stream: error recording CUDA event on stream 0x73329b0: CUDA_ERROR_INVALID_HANDLE: invalid resource handle; not marking stream as bad, as the Event object may be at fault. Monitor for further errors. | |
| 2019-09-07 21:06:30.681240: E external/org_tensorflow/tensorflow/stream_executor/stream.cc:332] Error recording event in stream: error recording CUDA event on stream 0x73329b0: CUDA_ERROR_INVALID_HANDLE: invalid resource handle; not marking stream as bad, as the Event object may be at fault. Monitor for further errors. | |
| 2019-09-07 21:06:31.035813: E external/org_tensorflow/tensorflow/stream_executor/stream.cc:332] Error recording event in stream: error recording CUDA event on stream 0x73329b0: CUDA_ERROR_INVALID_HANDLE: invalid resource handle; not marking stream as bad, as the Event object may be at fault. Monitor for further errors. | |
| 2019-09-07 21:06:31.201710: E external/org_tensorflow/tensorflow/compiler/xla/python/local_client.cc:731] Execution of replica 0 failed: Internal: failed to load in-memory CUBIN: CUDA_ERROR_INVALID_SOURCE: device kernel image is invalid | |
| Traceback (most recent call last): | |
| File "/usr/lib/python3.6/runpy.py", line 193, in _run_module_as_main | |
| "__main__", mod_spec) | |
| File "/usr/lib/python3.6/runpy.py", line 85, in _run_code | |
| exec(code, run_globals) | |
| File "/usr/local/google/home/jlebar/code/tensor2tensor/tensor2tensor/trax/trainer.py", line 133, in <module> | |
| app.run(main) | |
| File "/usr/local/google/home/jlebar/.local/lib/python3.6/site-packages/absl/app.py", line 300, in run | |
| _run_main(main, args) | |
| File "/usr/local/google/home/jlebar/.local/lib/python3.6/site-packages/absl/app.py", line 251, in _run_main | |
| sys.exit(main(argv)) | |
| File "/usr/local/google/home/jlebar/code/tensor2tensor/tensor2tensor/trax/trainer.py", line 129, in main | |
| trax.train(output_dir=output_dir) | |
| File "/usr/local/google/home/jlebar/.local/lib/python3.6/site-packages/gin/config.py", line 1073, in gin_wrapper | |
| utils.augment_exception_message_and_reraise(e, err_str) | |
| File "/usr/local/google/home/jlebar/.local/lib/python3.6/site-packages/gin/utils.py", line 49, in augment_exception_message_and_reraise | |
| six.raise_from(proxy.with_traceback(exception.__traceback__), None) | |
| File "<string>", line 3, in raise_from | |
| File "/usr/local/google/home/jlebar/.local/lib/python3.6/site-packages/gin/config.py", line 1050, in gin_wrapper | |
| return fn(*new_args, **new_kwargs) | |
| File "/usr/local/google/home/jlebar/code/tensor2tensor/tensor2tensor/trax/trax.py", line 878, in train | |
| save_steps=save_steps, has_weights=has_weights) | |
| File "/usr/local/google/home/jlebar/.local/lib/python3.6/site-packages/gin/config.py", line 1073, in gin_wrapper | |
| utils.augment_exception_message_and_reraise(e, err_str) | |
| File "/usr/local/google/home/jlebar/.local/lib/python3.6/site-packages/gin/utils.py", line 49, in augment_exception_message_and_reraise | |
| six.raise_from(proxy.with_traceback(exception.__traceback__), None) | |
| File "<string>", line 3, in raise_from | |
| File "/usr/local/google/home/jlebar/.local/lib/python3.6/site-packages/gin/config.py", line 1050, in gin_wrapper | |
| return fn(*new_args, **new_kwargs) | |
| File "/usr/local/google/home/jlebar/code/tensor2tensor/tensor2tensor/trax/trax.py", line 551, in __init__ | |
| rng, init_rng = jax_random.split(rng) | |
| File "/usr/local/google/home/jlebar/code/tensor2tensor/tensor2tensor/trax/backend.py", line 256, in split | |
| return backend()["random_split"](prng, num) | |
| File "/usr/local/google/home/jlebar/.local/lib/python3.6/site-packages/jax/random.py", line 193, in split | |
| return _split(key, num) | |
| File "/usr/local/google/home/jlebar/.local/lib/python3.6/site-packages/jax/api.py", line 147, in f_jitted | |
| out = xla.xla_call(flat_fun, *args_flat, device_assignment=device_assignment, backend=backend) | |
| File "/usr/local/google/home/jlebar/.local/lib/python3.6/site-packages/jax/core.py", line 569, in call_bind | |
| outs = primitive.impl(f, *args, **params) | |
| File "/usr/local/google/home/jlebar/.local/lib/python3.6/site-packages/jax/interpreters/xla.py", line 368, in _xla_call_impl | |
| return compiled_fun(*args) | |
| File "/usr/local/google/home/jlebar/.local/lib/python3.6/site-packages/jax/interpreters/xla.py", line 402, in _execute_compiled | |
| out_bufs = compiled.Execute(input_bufs).destructure() | |
| RuntimeError: Internal: failed to load in-memory CUBIN: CUDA_ERROR_INVALID_SOURCE: device kernel image is invalid | |
| In call to configurable 'Trainer' (<class 'tensor2tensor.trax.trax.Trainer'>) | |
| In call to configurable 'train' (<function train at 0x7f3b2c3fa8c8>) |