@automenta
Created December 27, 2022 07:39
saveModel: False
system:
    seed: 0
    work_dir: ./out/chargpt
data:
    block_size: 512
model:
    model_type: gpt-mini
    activation: LeakyReLU(negative_slope=0.01)
    n_layer: None
    n_head: None
    n_embd: None
    vocab_size: None
    block_size: None
    attn_pdrop: 0
    resid_pdrop: 0
    embd_pdrop: 0
trainer:
    device: auto
    max_iters: 100000.0
    batch_size: 4
    learning_rate: 0.0003
    betas: (0.9, 0.999)
    eps: 1e-08
    weight_decay: 0.1
    grad_norm_clip: None
    grad_accumulation_steps: 1
data has 994545 characters, 153 unique.
number of parameters: 6488832
running on device cuda
iter_dt 50.66ms; iter 50: train loss 3.39711
iter_dt 51.95ms; iter 100: train loss 2.69130
iter_dt 51.72ms; iter 150: train loss 2.53639
iter_dt 52.17ms; iter 200: train loss 2.44036
iter_dt 53.16ms; iter 250: train loss 2.38104
iter_dt 51.34ms; iter 300: train loss 2.38313
iter_dt 54.71ms; iter 350: train loss 2.30925
iter_dt 52.37ms; iter 400: train loss 2.22617
iter_dt 54.42ms; iter 450: train loss 2.17664
iter_dt 52.26ms; iter 500: train loss 2.13868
iter_dt 52.61ms; iter 550: train loss 2.01112
iter_dt 56.46ms; iter 600: train loss 1.93057
iter_dt 52.78ms; iter 650: train loss 1.86920
iter_dt 52.82ms; iter 700: train loss 1.81148
iter_dt 53.10ms; iter 750: train loss 1.78335
iter_dt 54.19ms; iter 800: train loss 1.72567
iter_dt 55.32ms; iter 850: train loss 1.66435
iter_dt 53.07ms; iter 900: train loss 1.67970
iter_dt 52.71ms; iter 950: train loss 1.63216
iter_dt 51.16ms; iter 1000: train loss 1.52614
(MOM (reasure ?ROR ?P))))))))))))
(rotiniphionininine 1 Buninising EnglishLanguage "A of &%Atrical this of ofis ethe benntans &%Holone the sheling pencing fon foulline
the wis che ore shas mequgnd &%Chinaralen, wathy pes the to mes ber equation
the thallicr ove andes &%Warine and a bomant ig that ore the a ande anccte tstintinion.")
(=>
(and
(instance ?S Hervingices)
(instance ?W ?P)
(instance ?S Cllicanting)
(exists (?B)
(and
(instance ?P Becect)
(agent ?O ?W)
(instance ?A Com)
(agent ?A ?A)
(instance ?O ?M Agan)
(instance ?S ?P)
(instance ?B Musing)
(exists (?PE ?D)
(and
(instance ?P Coreshing)
(part ?B ?P))
(poss (?H ?C)
(poses ?T ?C)
(contioncEntibus ?B ?RES)
(mesticlabce ?P ?OVE))
(documentation Attteribute ?P ?B))
(instance Intarestion)
(documentation Stad EnglishLanguage "&%Prone
crantal thatAr is that
subPhantiticalOfTumention ?WOC ?PH)
(and
(instance ?P BecinstinObject)
(instance ?T Sprort)
(instance ?X AgalSttical ?I)
(sumess ?O ?D)))))
(subclass Alelal Comars, cantical of a of a a &%Adepar
thate o &%Fean oues a of the &%Bear cof is a
&%Af the ise in the a thabut o on mes the of thin a &%Poceresss
to o a a ctin and in the cat is o a chauced incalAction itans o man is thecine
thale ame and ty a a a be se ond the a is bing cens. o a an a so a tates f the beat.")
(=>
(exists (?ACLACONINENG ?HOPENSOBJ)
(and ?TENGESSOMER ?PEMENOMRNUNT ?SOCENENG)
(instance ?PREROLLENTH Cancte)
(attribute ?OREGREN1 Colenting ?ANE)
(exists (?MENENTEMENG ?HUNTENT)
(and
(equal ?ARE ?OMUT)
(instance ?WECOCEN Selonin)
(mesul ?PEST ?MONGENTRONT)
(ecepation ?RNCENT ?SOMEDERENG
iter_dt 51.53ms; iter 1050: train loss 1.53647
iter_dt 52.59ms; iter 1100: train loss 1.52481
iter_dt 52.41ms; iter 1150: train loss 1.46956
iter_dt 52.04ms; iter 1200: train loss 1.45808
iter_dt 52.69ms; iter 1250: train loss 1.40854
iter_dt 52.35ms; iter 1300: train loss 1.39379
iter_dt 53.02ms; iter 1350: train loss 1.34325
iter_dt 54.96ms; iter 1400: train loss 1.31297
iter_dt 52.46ms; iter 1450: train loss 1.31000
iter_dt 53.54ms; iter 1500: train loss 1.22770
iter_dt 53.37ms; iter 1550: train loss 1.19190
iter_dt 53.32ms; iter 1600: train loss 1.17414
iter_dt 53.74ms; iter 1650: train loss 1.12198
iter_dt 52.02ms; iter 1700: train loss 1.09945
iter_dt 53.77ms; iter 1750: train loss 1.06456
iter_dt 51.95ms; iter 1800: train loss 1.02239
iter_dt 53.25ms; iter 1850: train loss 1.00477
iter_dt 53.35ms; iter 1900: train loss 0.97541
iter_dt 52.85ms; iter 1950: train loss 0.95811
iter_dt 53.83ms; iter 2000: train loss 0.93920
(?TEN ?ANGENT))
(subclass Human Aman)
(documentation Human EnglishLanguage "&%Human on &%Amman which the &%Read of &%Leatinges.
;; nor--105 with sthe the funure of the arean
covers of be to persson of for soff and for and fundement")
(=>
(instance ?H Human)
(hasPurpose ?H
(exists (?A)
(and
(instance ?A Human)
(patient ?H ?H)))))
(instance Human Amman)
(documentation Human EnglishLanguage "A &%Human which sof which used a &%Human.")
(=>
(and
(instance ?H Human)
(instance ?H Human)
(instance ?H Human)
(instance ?H Human)))
(subclass Tempach Will)
(subclass Willl)
(subclass Hempach Seman)
(documentation Tempachach EnglishLanguage "A &%WillleFa &%Humans wor bouns the &%WillleFated.")
(=>
(and
(instance ?W P)
(instance ?W FillleFatedFated)
(experiencer ?W ?FillleF)))
(subclass Tempaphachach FilleFaterFatedFatedFatedredFatedFatedredFatedFatedreFatedFatedreFatedFatedreFatedFatedFatedFatedFatedFatemFatedFatedFatedFatedFatedFoatedFateFatedFatedFatedFareFateFatedFaFeatedFaphateFatedFatedFaFatedFatedFeailFatedFandFatedFateFatedFatedFatedFalFeat aseferFapach
is the heater of ?P infllewTadryFated is the
?F the &%FoweedFaterFatieFaftedFaterFateFatedFatedt.")
(subFeFieateFaterFateFateFateFateFateFnFaterkFateFalteFateFeaFatedFaFeFeattedFaFeFateFeFateFeerFaFeFateFuteFedFaFedFeFateFeFoutt)
(dessFhFeForFn FeaFFFFat ?FF ?F)))
(desteFoFaterFaFeFFaterFArFeFatFeFotFFeFFFFeat))
(resultFoFnFeFFFFeattFeFFFFF EatFeFOFFFF)))
(desjeForFoFeFFFFFeF EFeatFFeFFFFF)
;;FeaFeFFFFFeeeFFFFFF FutiFeFFFFFFF)
;;;FequFFFFFFAFEFFFFFFFFF FreofFFFFFFFF)
;;;; FerwedFrFfFFFFFF WheeeFFFFFF FeaFFBFFFFFO) PeheeFF ?FAFFFFFFFF)
;hoFDFFFFFFFFF) ?FOFFFFFFF)
;egpttiFeFFFFFF FeatwFFFFFFFFHFODFFFW)
;eedFFFFFFFFFFFFFFF ?F))))
(instanceFeFeForFFFFFFFFFHF ?WHFFFHFFFHFD)
(FoweFFFFFFFFFFFFFFFDIFFFFFF ?BFFFFFFFFFFFFFFFF)))))
(domatedFowFFFFavFFFFFFFF EawFFFFFFFFFFFFFFF)
;olephFFFFFFFFFFFFFF ?WIFPHDFFFFFFHFT)))
(=>
(andFFFFFFFFFFFFFFFTHFFFFFF)
(andF
iter_dt 52.31ms; iter 2050: train loss 0.93479
iter_dt 52.95ms; iter 2100: train loss 0.88672
iter_dt 56.21ms; iter 2150: train loss 0.89765
iter_dt 52.67ms; iter 2200: train loss 0.85547
iter_dt 52.51ms; iter 2250: train loss 0.84808
iter_dt 52.58ms; iter 2300: train loss 0.79678
iter_dt 53.56ms; iter 2350: train loss 0.82491
iter_dt 51.93ms; iter 2400: train loss 0.81516
iter_dt 52.10ms; iter 2450: train loss 0.84013
iter_dt 53.95ms; iter 2500: train loss 0.82333
iter_dt 52.42ms; iter 2550: train loss 0.79247
iter_dt 53.52ms; iter 2600: train loss 0.75250
iter_dt 53.48ms; iter 2650: train loss 0.77318
iter_dt 52.30ms; iter 2700: train loss 0.75480
iter_dt 54.22ms; iter 2750: train loss 0.79628
iter_dt 59.80ms; iter 2800: train loss 0.73341
iter_dt 52.83ms; iter 2850: train loss 0.74871
iter_dt 54.11ms; iter 2900: train loss 0.79652
iter_dt 54.85ms; iter 2950: train loss 0.77823
iter_dt 53.10ms; iter 3000: train loss 0.75170
(greatedAttribute)
(domainSubclass HandAusit 1 Musition)
(roomainSubclass HandAusit 2 Measurement)
(subclass HandAusit HandAcialToxice)
(documentation HandAusit EnglishLanguage "The causition an &%HandArit by an &%Measures the which is caused areily
&%Measure.")
(=>
(instance ?X HandAuist)
(exists (?MEasure ?MEASITE ?MEASITE)
(and
(instance ?MEASITE Measure)
(part ?MEASITE ?MEASITE ?MEASITE)
(instrument ?MEASITE ?MEASITE ?MEASITE)
(part ?MEASITE ?MEASITE ?MEASITE)
(instance ?MEASITE Measure)
(result ?MEASITE ?MEASITE)
(destination ?MEASITE ?MEASITE))))))
(subclass Buildinating Sotain)
(documentation Buildinating EnglishLanguage "&%Buildinating a type of &%Buildinating a &%Buildinating of a
&%Buildinating released and behele a &%Sotaing covernts and to are &%Meain.")
(=>
(instance ?B Buildinating)
(exists (?SOTAITE ?BUILDINITE)
(and
(instance ?SOTAITE Soth)
(part ?BUILDIN ?SOUTIOATE ?BUILDINITEA ?BUILDINITEA ?BUILDINITEater ?BUILDINITEA
(part ?BUILDINITEatEd BuildinatingBuildinating ?SOTAITEA is a to of &%Sotain &%SotainsProcess instrument ?BUILDIN meains sounded by &%Process of a &%Wain
the connected or cullys of any &%Process ?AGENT water")
(instance pareatingFn TearnaryPredicate)
(domainSubclass pareatingFn 1 Buildinating)
(subclass WabldinatingFn Buildinating)
(documentation WabldinatingFn EnglishLanguage "(pareatingFn ?BUILDININITEA ?WANITEA) means that instrument of &%Paracting
of &%BuildinatingFn that is a &%BuildinatingFn intended
the &%BuildinatingFn
in wabldinatingFn in connected bedienating an deating water &%Buildinating from in &%Wabolid an
the &%Wabolmical of an an &%Wabolid of the particing of ?BUILDINIST is a
of cereating concered an the preciply is an &%Wabolidating a an connected or agent to or convers of a mone
to bown or to to &%Wabolidating an of an organ on organ indifating an reals fromal
through to expartic.")
(termFormat EnglishLanguage RelatingAnatingAnalAnalatingAnalalAnalatio
iter_dt 52.80ms; iter 3050: train loss 0.71776
iter_dt 52.55ms; iter 3100: train loss 0.73789
iter_dt 53.80ms; iter 3150: train loss 0.71101
iter_dt 53.77ms; iter 3200: train loss 0.68471
iter_dt 57.74ms; iter 3250: train loss 0.73179
iter_dt 52.69ms; iter 3300: train loss 0.68948
iter_dt 52.19ms; iter 3350: train loss 0.71829
iter_dt 53.36ms; iter 3400: train loss 0.70787
iter_dt 53.03ms; iter 3450: train loss 0.68357
iter_dt 53.16ms; iter 3500: train loss 0.66007
iter_dt 52.12ms; iter 3550: train loss 0.71271
iter_dt 52.99ms; iter 3600: train loss 0.69405
iter_dt 52.70ms; iter 3650: train loss 0.69863
iter_dt 52.73ms; iter 3700: train loss 0.66389
iter_dt 53.72ms; iter 3750: train loss 0.65003
iter_dt 54.59ms; iter 3800: train loss 0.68000
iter_dt 53.97ms; iter 3850: train loss 0.66724
iter_dt 55.04ms; iter 3900: train loss 0.66670
iter_dt 58.95ms; iter 3950: train loss 0.64262
iter_dt 52.18ms; iter 4000: train loss 0.69179
(=>
(and
(instance ?H Human)
(patient ?H ?P)
(patient ?H ?P)
(patient ?H ?D))
(instance ?H Human))
(=>
(and
(patient ?H ?H)
(attribute ?H Flow))
(not
(exists (?H)
(and
(instance ?H Human)
(patient ?H ?H))))))
(=>
(and
(patient ?H ?H)
(destination ?H ?H))))
(and
(patient ?H ?H)))
(instance ContentBearingObject)
(documentation CContentBearingObject EnglishLanguage "A &%ContentBearingObject that is
the &%Blood of &%ContentBoaringObject in
within the &%ContentBearingObject that is paint riving &%Containe.")
(instance ContentBaringObject Solid)
(instance ContentBoaringObject)
(documentation ContentBoaringObject EnglishLanguage "Any &%ContentBoaringObjects with the dist incentified
an &%Blood that are starts of a &%ContentBoaringObject.")
(=>
(and
(instance ?B ContentBoaringObject)
(resource ?B ?B))
(exists (?C)
(and
(distance ?C ContentBoaringObject)
(resource ?C ?C))))
(subclass ArriboArrifact ArmitalArribove)
(documentation ArriboArrifact EnglishLanguage "Any &%Walking for the &%Hols.")
(=>
(and
(instance ?ARRICHICHICHICHICHICHICHICHICHICHICHICCHICHICHICHICHICHIChisipHipinians ArriboArries ArriboArribo
arriboard in order in ?ARRICHICHICHICHICHICRICHICHICHICHICHICHICHICHICHICHICHICHICLIVICHICHICHICHIWCHICHICHICHICHICHICHICHICHICHICHICHISCHICHICHICHIWCHICHICHICHICHICHICHICHICHICHICHICHICHICHICHICATINCHICHICHIC
in ?MMMMMMMMMMMMMMMMMMMMMMMMMmmmMMMMMMMmmMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMM))
(=>
(and
(instance ?MMMMMMMMMMMMMMMMMMMMMMM) MMMMMMMMMMMMMMMMMMMMMMMMMMMMMM)
MeasurGmnnvfn ?MMMMMMMMMMMMMMMMMMMMMMMMMMM)
(MeasureFn ?MMMMMMMMMMMMM) MMMMMUMMMMMMMMMMMM))
(boirMMeasurcMeasurMasncHinvMonin ?MMMMMMMMMMMMMMMMMMAMMMMM))
(measureMasuringMoninvMoninVin)
(MeasuringPoinnt MavMMalmMaximMacNumndInLinHumumumMeasnicNumbeat)
(MeasureMininvMeasurcLenngToN MaxiMoniumMeas)
(MeasureMiningPoint MavinMacniumMeasurium (MeasureFn 219 MeasureMasuregOfN
(MeasureFn MalvinMam
iter_dt 52.63ms; iter 4050: train loss 0.64915
iter_dt 53.25ms; iter 4100: train loss 0.65344
iter_dt 57.01ms; iter 4150: train loss 0.64663
iter_dt 52.20ms; iter 4200: train loss 0.63649
iter_dt 52.59ms; iter 4250: train loss 0.62143
iter_dt 52.87ms; iter 4300: train loss 0.61081
iter_dt 52.61ms; iter 4350: train loss 0.62691
iter_dt 53.34ms; iter 4400: train loss 0.61853
iter_dt 56.57ms; iter 4450: train loss 0.59507
iter_dt 52.23ms; iter 4500: train loss 0.64428
iter_dt 53.46ms; iter 4550: train loss 0.59151
iter_dt 52.94ms; iter 4600: train loss 0.63137
iter_dt 53.64ms; iter 4650: train loss 0.62865
iter_dt 52.52ms; iter 4700: train loss 0.60071
iter_dt 52.71ms; iter 4750: train loss 0.60802
iter_dt 58.66ms; iter 4800: train loss 0.61305
iter_dt 52.52ms; iter 4850: train loss 0.57242
iter_dt 55.68ms; iter 4900: train loss 0.59254
iter_dt 53.20ms; iter 4950: train loss 0.63232
iter_dt 54.58ms; iter 5000: train loss 0.58909
(?P)))
(subclass Perading OrganismProgram)
(documentation Perading EnglishLanguage "Any &%OrganismProgram that is designed from the
contains of and forms the has a numane organism.")
(subclass Perase Organ)
(documentation Perase EnglishLanguage "A &%OrganismProgram that serves to large
of should by of an acrougtanism of &%OrganismProgram and in
where hen instrument remic a &%Permaneutic of and used-to bused organisms
and time bused indon.")
(=>
(instance ?P Perase)
(hasPurpose ?P
(exists (?HEAD ?P ?ROOP ?P)
(and
(instance ?P MeasuringProgram)
(instance ?P Permanent)
(holdsDuring
(EndFn
(WhenFn ?P)
(attribute ?P Demotic))))))))
(subclass Rooms Attribute)
(documentation Rooms EnglishLanguage "An &%Attribute which is and organ the
lower contibion of a &%Rooms or where that the chest of a
&%Rooting or surrounder but work for provide which is some
to chest a can often and or pirchase in involves to a &%Attribute and power a &%Rootion.")
(=>
(instance ?ROOT AttoribicString)
(exists (?LOC)
(and
(instance ?LOC Rooting)
(instrument ?ROOT ?ROOT)))))
(instance SocialConten Rootion)
(documentation SocialContent EnglishLanguage "&%SocialContent that a &%SocialContent &%SocialContent
instance or an other &%SocialContentBearingObject hydrong the
contains of into content or the content its differet for breeding to contains
eyering.")
(subclass SocialContent SocialContent)
(documentation SocialContent EnglishLanguage "&%SocialContent that an &%Animal
into the acert is publy and ressumed to the activities contains the
social an &%Animal &%SocialContent to another socially social
a &%SucialContent that for the social contains and or
contains are dong the social of a social often intoxic.")
(=>
(instance ?C SocialContent)
(exists (?S)
(and
(instance ?S SocialContent)
(instance ?S SocialContent)))))
(instance SocialContentB SocialContentBearingProving)
(documentation SocialContent
iter_dt 53.04ms; iter 5050: train loss 0.62656
iter_dt 53.24ms; iter 5100: train loss 0.56571
iter_dt 54.62ms; iter 5150: train loss 0.56843
iter_dt 53.41ms; iter 5200: train loss 0.60219
iter_dt 53.37ms; iter 5250: train loss 0.55493
iter_dt 53.26ms; iter 5300: train loss 0.60102
iter_dt 52.69ms; iter 5350: train loss 0.56496
iter_dt 52.27ms; iter 5400: train loss 0.56369
iter_dt 53.81ms; iter 5450: train loss 0.57958
iter_dt 52.63ms; iter 5500: train loss 0.59037
iter_dt 52.29ms; iter 5550: train loss 0.58885
iter_dt 52.39ms; iter 5600: train loss 0.56839
iter_dt 52.67ms; iter 5650: train loss 0.57942
iter_dt 52.84ms; iter 5700: train loss 0.58650
iter_dt 52.16ms; iter 5750: train loss 0.56483
iter_dt 52.42ms; iter 5800: train loss 0.55676
iter_dt 52.82ms; iter 5850: train loss 0.53826
iter_dt 53.13ms; iter 5900: train loss 0.57375
iter_dt 53.84ms; iter 5950: train loss 0.58500
iter_dt 53.51ms; iter 6000: train loss 0.56519
(subclass Landless Container)
(documentation Landles EnglishLanguage "Any &%Position where the lands of the &%BodyPressures
the &%Container.")
(=>
(and
(instance ?LANDLE Landless)
(patient ?LANDLE ?POSIT)
(instance ?POSIT Container)
(instrument ?LANDLES ?POSIT)
(instance ?POSIT Position)
(patient ?POSIT ?POSIT)
(instance ?POSIT Position)
(instance ?POSIT Position)
(instance ?POSIT Position)
(origin ?POSIT (refers ?POSIT ?POSIT))
(instance ?POSIT SocialPosition)
(equal ?POSIT
(PositionFn ?POSIT))
(equal ?LAND (PositionFn ?POSIT))
(equal
(PositionFn (WhenFn ?POSIT)) (attribute ?POSIT ?POSIT))
(documentation Positions EnglishLanguage "The spiecies of a positions
minitures is a positions in to act in umbet that the first.")
(=>
(and
(agent ?P ?P)
(instance ?P Positions)
(agent ?P ?H)
(positions ?P ?P ?I)
(desires ?P ?P)
(positions ?P ?P)
(positions ?P ?N)
(exists (?L)
(and
(instance ?L Liquid)
(positions ?P ?L)
(instance ?L Liquid)))
(=>
(instance ?P Positions)
(exists (?T ?H)
(and
(instance ?T Transsiciping)
(patient ?T ?H)
(instance ?T Translocation)
(experiencer ?T ?O)
(or
(instance ?T Translocation)
(instance ?T Translocation)
(experiencer ?T ?H)
(exists (?E)
(and
(instance ?E Translocation)
(patient ?E ?H)
(instrument ?T ?T)))
(subclass Scandil Organ)
(subclass Scandil Scandil)
(documentation Scandil EnglishLanguage "&%Scandil is a type of two is can class that is in
the &%Expering of the &%Scandil.")
(=>
(instance ?S Scandil)
(can ?S ?H))
(=>
(instance ?S Scandil)
(capability Scandil patient ?S))
(subclass Scandil Scandil)
(termFormat EnglishLanguage Scandil "scandil)
(documentation Scandil EnglishLanguage "Any &%Scome that has the &%Scandil to intermed by the
periods of any &%Scandil.")
(subclass Performance Scandil)
(termFormat EnglishLanguage Scandil "scandil")
(documentation Performance EnglishLanguage
iter_dt 58.76ms; iter 6050: train loss 0.54940
iter_dt 52.22ms; iter 6100: train loss 0.55355
iter_dt 52.01ms; iter 6150: train loss 0.55005
iter_dt 54.23ms; iter 6200: train loss 0.58589
iter_dt 55.31ms; iter 6250: train loss 0.57541
iter_dt 52.04ms; iter 6300: train loss 0.60238
iter_dt 52.37ms; iter 6350: train loss 0.54212
iter_dt 52.96ms; iter 6400: train loss 0.54974
iter_dt 52.14ms; iter 6450: train loss 0.53999
iter_dt 53.76ms; iter 6500: train loss 0.55883
iter_dt 51.94ms; iter 6550: train loss 0.54497
iter_dt 53.99ms; iter 6600: train loss 0.52057
iter_dt 52.64ms; iter 6650: train loss 0.54162
iter_dt 52.75ms; iter 6700: train loss 0.56164
iter_dt 52.56ms; iter 6750: train loss 0.56072
iter_dt 52.27ms; iter 6800: train loss 0.52944
iter_dt 51.92ms; iter 6850: train loss 0.52597
iter_dt 52.75ms; iter 6900: train loss 0.53782
iter_dt 53.43ms; iter 6950: train loss 0.52239
iter_dt 52.48ms; iter 7000: train loss 0.52952
---------------------------------------------------------------------------
KeyboardInterrupt Traceback (most recent call last)
<ipython-input-51-ebed62e87fc8> in <module>
111
112 # run the optimization
--> 113 trainer.run()
12 frames
<ipython-input-50-e041bcc85fd1> in forward(self, x)
183 att = self.attn_dropout(att)
184
--> 185 y = att @ v # (B, nh, T, T) x (B, nh, T, hs) -> (B, nh, T, hs)
186 y = y.transpose(1, 2).contiguous().view(B, T, C) # re-assemble all head outputs side by side
187
KeyboardInterrupt:
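
The `iter_dt ...; iter N: train loss X` lines above can be read back into numbers for a quick sanity check. The following is a minimal sketch, not part of the original run. It assumes the reported train loss is mean cross-entropy in nats per character, and that each iteration processes `batch_size * block_size` characters (4 and 512 in the config above). The helper names `parse_log` and `summarize` are hypothetical.

```python
import math
import re

# Matches lines such as: "iter_dt 52.48ms; iter 7000: train loss 0.52952"
LOG_LINE = re.compile(
    r"iter_dt (?P<dt>[\d.]+)ms; iter (?P<it>\d+): train loss (?P<loss>[\d.]+)"
)

def parse_log(text):
    """Return a list of (iteration, iter_dt in ms, train loss) tuples."""
    return [
        (int(m["it"]), float(m["dt"]), float(m["loss"]))
        for m in LOG_LINE.finditer(text)
    ]

def summarize(rows, batch_size=4, block_size=512):
    """Summarize the last logged iteration.

    Assumption: loss is mean cross-entropy in nats per character, and one
    iteration consumes batch_size * block_size characters.
    """
    it, dt_ms, loss = rows[-1]
    chars_per_iter = batch_size * block_size
    return {
        "iter": it,
        "loss_nats": loss,
        "bits_per_char": loss / math.log(2),
        "perplexity": math.exp(loss),
        "chars_per_sec": chars_per_iter / (dt_ms / 1000.0),
    }

# Two lines copied from the log above (first and last reported iterations).
sample = """iter_dt 50.66ms; iter 50: train loss 3.39711
iter_dt 52.48ms; iter 7000: train loss 0.52952"""

rows = parse_log(sample)
print(summarize(rows))
```

For reference, under the same assumption a uniform guess over the 153 unique characters would score ln(153) nats per character, which gives a baseline for the losses in the log.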