Commit: modified

Demi-wlw committed Jul 28, 2024
1 parent 0525bf5 commit 8910fa7
Showing 1 changed file with 2 additions and 2 deletions.
4 changes: 2 additions & 2 deletions _posts/2023-03-19-ChatGPT.md
@@ -92,9 +92,9 @@ GPT has been a major breakthrough in natural language processing and the version
The term _generative pre-training_ represents the unsupervised pre-training of the generative model.<d-footnote>They used a multi-layer Transformer decoder to produce an output distribution over target tokens.</d-footnote> Given an unsupervised corpus of tokens $\mathcal{U} = (u_1,\dots,u_n)$, they use a standard language modelling objective to maximize the following likelihood:
{: .text-justify}

-$
+$$
L_1(\mathcal{U})=\sum_i\log P(u_i\mid u_{i-k},\dots,u_{i-1};\Theta)
-$
+$$

where $k$ is the size of the context window, and the conditional probability $P$ is modelled using a neural network with parameters $\Theta$ trained using stochastic gradient descent. **Intuitively, we train the Transformer-based model to predict the next token within the $k$-context window using unlabeled text from which we also extract the latent features $h$.**
{: .text-justify}
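
For a concrete reading of the objective above, here is a minimal Python sketch (not part of the original post). The function `l1_likelihood`, the `model` callable, and the uniform toy model are illustrative assumptions; `model` stands in for any network with parameters $\Theta$ that maps a context to a next-token distribution.

```python
import math
from typing import Callable, Sequence

def l1_likelihood(
    tokens: Sequence[int],
    model: Callable[[Sequence[int]], Sequence[float]],
    k: int,
) -> float:
    """Return L1(U) = sum_i log P(u_i | u_{i-k}, ..., u_{i-1})."""
    total = 0.0
    for i in range(1, len(tokens)):
        context = tokens[max(0, i - k):i]  # at most k preceding tokens
        probs = model(context)             # hypothetical: distribution over the vocabulary
        total += math.log(probs[tokens[i]])
    return total

# Toy check with a uniform model over a 50-token vocabulary:
# each of the 3 predicted tokens contributes log(1/50).
uniform = lambda context: [1.0 / 50] * 50
print(l1_likelihood([3, 7, 7, 1], uniform, k=2))  # 3 * log(1/50) ≈ -11.736
```

In practice the model is the multi-layer Transformer decoder mentioned in the footnote, and the sum is maximized with stochastic gradient descent rather than evaluated as here.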
