Information technology — Plenoptic image coding system (JPEG Pleno) — Part 2: Light field coding

This document specifies a coded codestream format for storage of light field modalities as well as associated metadata descriptors that are light field modality specific. This document also provides information on the encoding tools.

Technologies de l'information — Système de codage d'images plénoptiques (JPEG Pleno) — Partie 2: Codages des champs de lumière

General Information

Status
Published
Publication Date
07-Apr-2021
Current Stage
6060 - International Standard published
Start Date
08-Apr-2021
Due Date
25-Aug-2020
Completion Date
08-Apr-2021

Standard
ISO/IEC 21794-2:2021 - Information technology -- Plenoptic image coding system (JPEG Pleno)
English language
117 pages
Draft
ISO/IEC FDIS 21794-2:Version 02-jan-2021 - Information technology -- Plenoptic image coding system (JPEG Pleno)
English language
117 pages

Standards Content (Sample)

INTERNATIONAL STANDARD
ISO/IEC 21794-2
First edition
2021-04
Information technology — Plenoptic image coding system (JPEG Pleno) —
Part 2:
Light field coding
Technologies de l'information — Système de codage d'images plénoptiques (JPEG Pleno) —
Partie 2: Codages des champs de lumière
Reference number
ISO/IEC 21794-2:2021(E)
© ISO/IEC 2021


COPYRIGHT PROTECTED DOCUMENT
© ISO/IEC 2021
All rights reserved. Unless otherwise specified, or required in the context of its implementation, no part of this publication may
be reproduced or utilized otherwise in any form or by any means, electronic or mechanical, including photocopying, or posting
on the internet or an intranet, without prior written permission. Permission can be requested from either ISO at the address
below or ISO’s member body in the country of the requester.
ISO copyright office
CP 401 • Ch. de Blandonnet 8
CH-1214 Vernier, Geneva
Phone: +41 22 749 01 11
Email: copyright@iso.org
Website: www.iso.org
Published in Switzerland

Contents
Foreword
Introduction
1 Scope
2 Normative references
3 Terms and definitions
4 Symbols and abbreviated terms
4.1 Symbols
4.2 Abbreviated terms
5 Conventions
5.1 Naming conventions for numerical values
5.2 Operators
5.2.1 Arithmetic operators
5.2.2 Logical operators
5.2.3 Relational operators
5.2.4 Precedence order of operators
5.2.5 Mathematical functions
6 General
6.1 Functional overview on the decoding process
6.2 Encoder requirements
6.3 Decoder requirements
7 Organization of the document
Annex A (normative) JPEG Pleno Light Field superbox
Annex B (normative) 4D transform mode
Annex C (normative) JPEG Pleno light field reference view decoding
Annex D (normative) JPEG Pleno light field normalized disparity view decoding
Annex E (normative) JPEG Pleno Light Field Intermediate View superbox
Bibliography


Foreword
ISO (the International Organization for Standardization) and IEC (the International Electrotechnical
Commission) form the specialized system for worldwide standardization. National bodies that
are members of ISO or IEC participate in the development of International Standards through
technical committees established by the respective organization to deal with particular fields of
technical activity. ISO and IEC technical committees collaborate in fields of mutual interest. Other
international organizations, governmental and non-governmental, in liaison with ISO and IEC, also
take part in the work.
The procedures used to develop this document and those intended for its further maintenance are
described in the ISO/IEC Directives, Part 1. In particular, the different approval criteria needed for
the different types of document should be noted. This document was drafted in accordance with the
editorial rules of the ISO/IEC Directives, Part 2 (see www.iso.org/directives or
www.iec.ch/members_experts/refdocs).
Attention is drawn to the possibility that some of the elements of this document may be the subject
of patent rights. ISO and IEC shall not be held responsible for identifying any or all such patent
rights. Details of any patent rights identified during the development of the document will be in the
Introduction and/or on the ISO list of patent declarations received (see www.iso.org/patents) or the IEC
list of patent declarations received (see patents.iec.ch).
Any trade name used in this document is information given for the convenience of users and does not
constitute an endorsement.
For an explanation of the voluntary nature of standards, the meaning of ISO specific terms and
expressions related to conformity assessment, as well as information about ISO's adherence to the
World Trade Organization (WTO) principles in the Technical Barriers to Trade (TBT), see
www.iso.org/iso/foreword.html. In the IEC, see www.iec.ch/understanding-standards.
This document was prepared by Joint Technical Committee ISO/IEC JTC 1, Information technology,
Subcommittee SC 29, Coding of audio, picture, multimedia and hypermedia information.
A list of all parts in the ISO/IEC 21794 series can be found on the ISO website.
Any feedback or questions on this document should be directed to the user’s national standards body. A
complete listing of these bodies can be found at www.iso.org/members.html and
www.iec.ch/national-committees.

Introduction
This document is part of a series of standards for a system known as JPEG Pleno. This document defines
the JPEG Pleno framework. It facilitates the capture, representation, exchange and visualization of
plenoptic imaging modalities. A plenoptic image modality can be a light field, point cloud or hologram,
which are sampled representations of the plenoptic function in the form of, respectively, a vector
function that represents the radiance of a discretized set of light rays, a collection of points with
position and attribute information, or a complex wavefront. The plenoptic function describes the
radiance in time and in space obtained by positioning a pinhole camera at every viewpoint in 3D spatial
coordinates, every viewing angle and every wavelength, resulting in a 7D function.
JPEG Pleno specifies tools for coding these modalities while providing advanced functionality at system
level, such as support for data and metadata manipulation, editing, random access and interaction,
protection of privacy and ownership rights.
INTERNATIONAL STANDARD ISO/IEC 21794-2:2021(E)
Information technology — Plenoptic image coding system
(JPEG Pleno) —
Part 2:
Light field coding
1 Scope
This document specifies a coded codestream format for storage of light field modalities as well as
associated metadata descriptors that are light field modality specific. This document also provides
information on the encoding tools.
2 Normative references
The following documents are referred to in the text in such a way that some or all of their content
constitutes requirements of this document. For dated references, only the edition cited applies. For
undated references, the latest edition of the referenced document (including any amendments) applies.
ITU-T Rec. T.800 | ISO/IEC 15444-1, Information technology — JPEG 2000 image coding system — Part 1:
Core coding system
ITU-T Rec. T.801 | ISO/IEC 15444-2, Information technology — JPEG 2000 image coding system — Part 2:
Extensions
ISO/IEC 21794-1:2020, Information technology — Plenoptic image coding system (JPEG Pleno) — Part 1:
Framework
ISO/IEC 60559, Information technology — Microprocessor Systems — Floating-Point arithmetic
3 Terms and definitions
For the purposes of this document the terms and definitions given in ISO/IEC 21794-1 and the
following apply.
ISO and IEC maintain terminological databases for use in standardization at the following addresses:
— ISO Online browsing platform: available at https://www.iso.org/obp
— IEC Electropedia: available at http://www.electropedia.org/
3.1
arithmetic coder
entropy coder that converts variable length strings to variable length codes (encoding) and vice versa
(decoding)
3.2
bit-plane
two-dimensional array of bits
3.3
4D bit-plane
four-dimensional array of bits

3.4
coefficient
numerical value that is the result of a transformation or linear regression
3.5
compression
reduction in the number of bits used to represent source image data
3.6
depth
distance of a point in 3D space to the camera plane
3.7
disparity view
image that for each pixel of the subaperture view contains the apparent pixel shift between two
subaperture views along either horizontal or vertical axis
3.8
hexadeca-tree
division of a 4D region into 16 (sixteen) 4D subregions
3.9
pixel
collection of sample values in the spatial image domain having all the same sample coordinates
EXAMPLE A pixel may consist of three samples describing its red, green and blue value.
3.10
plenoptic function
amount of radiance in time and in space by positioning a pinhole camera at every viewpoint in 3D
spatial coordinates, every viewing angle and every wavelength, resulting in a 7D representation
3.11
reference view
subaperture view that is used as one of the references to generate the intermediate views
3.12
subaperture view
subaperture image
image taken of the 3D scene by a pinhole camera positioned at a particular viewpoint and viewing angle
3.13
texture
pixel attributes
EXAMPLE Colour information, opacity, etc.
3.14
transform
transformation
mathematical mapping from one signal space to another
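The hexadeca-tree of definition 3.8 can be illustrated with a short Python sketch. The function names `split_hexadeca` and `volume`, and the representation of a region as four half-open intervals, are assumptions made for illustration and do not appear in this document. The sketch always splits at the midpoint and assumes every axis has a length of at least 2; the normative partitioning rules are those of Annex B.

```python
from itertools import product
from typing import List, Tuple

Axis = Tuple[int, int]                   # half-open interval [start, stop)
Region = Tuple[Axis, Axis, Axis, Axis]   # axes in the order t, s, v, u


def split_hexadeca(region: Region) -> List[Region]:
    """Halve a 4D region along each of its four axes, giving 2**4 = 16 subregions."""
    halves = []
    for start, stop in region:
        mid = (start + stop) // 2        # illustrative midpoint split
        halves.append(((start, mid), (mid, stop)))
    # Every combination of "first half / second half" over the four axes.
    return [tuple(choice) for choice in product(*halves)]


def volume(region: Region) -> int:
    """Number of samples covered by a 4D region."""
    count = 1
    for start, stop in region:
        count *= stop - start
    return count


subregions = split_hexadeca(((0, 4), (0, 4), (0, 8), (0, 8)))
print(len(subregions))
```

The 16 subregions are disjoint and together cover the parent region, which is the property the term relies on.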

4 Symbols and abbreviated terms
4.1 Symbols
Codestream_Body() coded image data in the codestream without Codestream_Header()
Codestream_Header() codestream header preceding the image data in the codestream
$D^{DEC}(t,s,v,u)$ decoded normalized disparity value at view $(t,s)$ for pixel location $(v,u)$
$D(t,s,v,u)$ normalized disparity value at view $(t,s)$ for pixel location $(v,u)$
$DPEC_k$ pointer to contiguous codestream for normalized disparity view k
$D_{shift}$ scaling parameter to translate quantized normalized disparity maps to positive range
DCODEC disparity view codec type
f focal length
$FPW_p$ fixed-weight merging parameter for view p
$H(t,s)$ view hierarchy value for view $(t,s)$
$HCC(t,s)$ horizontal camera centre coordinate for view $(t,s)$
$H_D(t,s)$ binary value defining the availability of a normalized disparity view $(t,s)$
$J_0$ Lagrangian encoding cost
$J_1$ Lagrangian encoding cost of spatial partitioning
$J_2$ Lagrangian encoding cost of view partitioning
$KR_{p,c}$ sparse filter regressor mask of texture component c for view p
LightField() JPEG Pleno light field codestream
$LSW_j^{p,c}$ quantized least-squares merging weight of texture component c for view p, $j = 1, 2, \ldots, NLS_p$
MIDV absolute value of the minimum value over all quantized normalized disparity views
$MMODE_p$ view merging mode for intermediate view p
$MSP_p$ sparse filter order for view p
$NLS_p$ number of least-squares merging coefficients for intermediate view p
$NRT_p$ regressor template size parameter for sparse filter for view p
NC number of components in an image
$N_I$ number of intermediate views
$N_{NDV}$ number of reference normalized disparity views
$N_p^D$ number of normalized disparity reference views for intermediate view p
$N_p^T$ number of texture reference views for intermediate view p
$N_{REF}$ number of reference views
$N_{RES}$ number of prediction residual views
$N_{sp}$ total available number of regressors for sparse filter
Plev level a particular codestream complies to
Ppih profile a particular codestream complies to
$Q_p$ 2D image of dimensions $V \times U$, defines the occlusion state-based segmentation at intermediate view p
Q normalized disparity quantization parameter
R rate or bitrate, expressed in bit per sample
RCODEC prediction residual view codec type
RDATA array of bytes containing for a single prediction residual view the RCODEC codestream after header information has been stripped
RENCODING array of bytes containing for a single prediction residual view the full DCODEC codestream
RGB colour data for the red, green and blue colour component of a pixel
RHEADER array of bytes containing for a single prediction residual view the header information from the RCODEC codestream
$RPEC_j$ pointer to contiguous codestream for prediction residual view j
s coordinate of the addressed subaperture image along the s-axis
S size of the light field image along the s-axis (COLUMNS)
$s_i^{Tr}$ subscript of the column index of the reference view i, $i = 1, 2, \ldots, N_p^T$, in the light field array in row-wise scanning order
$s_j^{Dr}$ subscript of the column index of the reference normalized disparity view j, $j = 1, 2, \ldots, N_p^D$, in the light field array in row-wise scanning order
$SF_p$ binary variable, determines if sparse filter is used (true) or not (false)
$SPW_j^{p,0}$ quantized sparse filter coefficients of texture component c for view p, $j = 1, 2, \ldots, MSP_p$
$SPW_j^{p,c}$ de-quantized sparse filter coefficients of texture component c for view p, $j = 1, 2, \ldots, MSP_p$
t coordinate of the addressed subaperture image along the t-axis
T size of the light field image along the t-axis (ROWS)
$t_i^{Tr}$ subscript of the row index of the reference view i, $i = 1, 2, \ldots, N_p^T$, in the light field array in row-wise scanning order
$t_j^{Dr}$ subscript of the row index of the reference normalized disparity view j, $j = 1, 2, \ldots, N_p^D$, in the light field array in row-wise scanning order
$(t_k^D, s_k^D)$ view coordinate subscripts for normalized disparity view k
$(t_l^X, s_l^X)$ view coordinate subscripts for reference view l
$(t_p^I, s_p^I)$ view coordinate subscripts for intermediate view p
$t_k \times s_k \times v_k \times u_k$ 4D block dimensions at the 4D block partitioning stage
$t_b \times s_b \times v_b \times u_b$ 4D block dimensions at the bit-plane hexadeca-tree decomposition stage
TCODEC reference view codec type
TDATA array of bytes, containing for a single reference view, the TCODEC codestream, after header information has been stripped
TENCODING array of bytes, containing for a single reference view the full TCODEC codestream
THEADER array of bytes, containing for a single reference view the header information from the TCODEC codestream
$TPEC_l$ pointer to contiguous codestream for reference view l
u sample coordinate along the u-axis within the addressed subaperture image
U size of the subaperture image along the u-axis (WIDTH)
v sample coordinate along the v-axis within the addressed subaperture image
V size of the subaperture image along the v-axis (HEIGHT)
$VCC(t,s)$ vertical camera centre coordinate for view $(t,s)$
$VPP_p$ view prediction parameters for intermediate view p
$X(t,s,v,u,c)$ texture value at view $(t,s)$ for pixel location $(v,u)$ for texture component c
$X^{DEC}(t,s,v,u,c)$ decoded texture value at view $(t,s)$ for pixel location $(v,u)$ for texture component c
$X_W^{(t_1,s_1)}(t_2,s_2)$ result of warping the texture view $(t_1,s_1)$ to view location $(t_2,s_2)$
$\Delta x$ horizontal distance between a pair of camera centres
$\Delta y$ vertical distance between a pair of camera centres
YCbCr colour data for the luminance, the blue chrominance and the red chrominance component of a pixel
$z(t,s,v,u)$ depth value at view $(t,s)$ for pixel location $(v,u)$
$\hat{\theta}_i^p$ distance based merging weight for reference view $i = 1, \ldots, N_p^T$ at intermediate view p
$\alpha_i^p$ distance based factor, used for defining the merging weight, at intermediate view p for reference view $i = 1, \ldots, N_p^T$
$\Gamma_p$ binary matrix, defining the locations of the non-zero merging weights in merging weight matrix $\Theta_{p,c}$ at intermediate view p. It is identical between all colour components c
$\theta_j^{p,c}$ de-quantized least-squares merging weight of texture component c for view p, $j = 1, 2, \ldots, NLS_p$
$\theta_{p,c}^{sp}$ sparse filter coefficients at intermediate view p for colour component c
$\Theta_{p,c}$ merging weight matrix for intermediate view p for colour component c
$\Upsilon_{p,c}$ locations of the non-zero elements of $\Psi_{(v,u)}$
$\Psi_{(v,u)}$ regressor template at pixel location $(v,u)$
$\Omega_p^{Dr}$ set of reference normalized disparity views for intermediate view p
$\Omega_p^{occlD}$ set of occluded pixels, which remain to be inpainted, during normalized disparity view synthesis at intermediate view p
$\Omega_p^{occlT}$ set of occluded pixels, which remain to be inpainted, during texture view synthesis at intermediate view p
$\Omega_p^{Tr}$ set of reference views for intermediate view p
4.2 Abbreviated terms
2D two dimensional
3D three dimensional
4D four dimensional
DCT discrete cosine transform
floating point floating point notation as specified in ISO/IEC 60559
HTTP hypertext transfer protocol
IDCT inverse DCT
IPR intellectual property rights
IV intermediate view; subaperture view that is generated from surrounding reference view(s)
JPEG Joint Photographic Experts Group
JPL JPEG Pleno file format
LSB least significant bit
MSB most significant bit
R-D rate-distortion
RV reference view
URL uniform resource locator
XML eXtensible Markup Language
5 Conventions
5.1 Naming conventions for numerical values
Integer numbers are expressed as bit patterns, hexadecimal values or decimal numbers. Bit patterns
and hexadecimal values have both a numerical value and an associated particular length in bits.
Hexadecimal notation, indicated by prefixing the hexadecimal number by "0x", may be used instead
of binary notation to denote a bit pattern having a length that is an integer multiple of 4. For example,
0x41 represents an eight-bit pattern having only its second most significant bit and its least significant
bit equal to 1. Numerical values that are specified under a "Code" heading in tables that are referred to
as "code tables" are bit pattern values (specified as a string of digits equal to 0 or 1 in which the left-
most bit is considered the most-significant bit). Other numerical values not prefixed by "0x" are decimal
values. When used in expressions, a hexadecimal value is interpreted as having a value equal to the
value of the corresponding bit pattern evaluated as a binary representation of an unsigned integer (i.e.
as the value of the number formed by prefixing the bit pattern with a sign bit equal to 0 and interpreting
the result as a two's complement representation of an integer value). For example, the hexadecimal
value 0xF is equivalent to the 4-bit pattern '1111' and is interpreted in expressions as being equal to the
decimal number 15.
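The two examples in 5.1 can be checked with a few lines of Python. This is a minimal sketch: the helper name `hex_to_pattern` is an assumption for illustration and is not defined by this document. It expands each hexadecimal digit to four bits, so the resulting pattern has a length that is a multiple of 4, and it evaluates the pattern as an unsigned integer.

```python
def hex_to_pattern(text: str) -> str:
    """Expand a '0x'-prefixed hexadecimal value to its bit pattern (MSB on the left)."""
    digits = text[2:]                      # drop the "0x" prefix
    length_bits = 4 * len(digits)          # each hexadecimal digit carries 4 bits
    return format(int(digits, 16), f"0{length_bits}b")


pattern = hex_to_pattern("0x41")
print(pattern)                             # eight-bit pattern for 0x41
print(int(hex_to_pattern("0xF"), 2))       # unsigned value of the 4-bit pattern
```

For 0x41 only the second most significant bit and the least significant bit are set, and 0xF evaluates to 15, matching the text above.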
5.2 Operators
NOTE Many of the operators used in this document are similar to those used in the C programming
language.
5.2.1 Arithmetic operators
+ addition
− subtraction (as a binary operator) or negation (as a unary prefix operator)
× multiplication
/ division without truncation or rounding
<< left shift; x<<s is defined as $x \times 2^{s}$
>> right shift; x>>s is defined as $\lfloor x / 2^{s} \rfloor$
++ increment with 1
-- decrement with 1

umod x umod a is the unique value y between 0 and a–1
for which y+Na = x with a suitable integer N
& bitwise AND operator; compares each bit of the first operand to the corresponding bit
of the second operand
If both bits are 1, the corresponding result bit is set to 1. Otherwise, the corresponding
result bit is set to 0.
^ bitwise XOR operator; compares each bit of the first operand to the corresponding bit
of the second operand
If both bits are equal, the corresponding result bit is set to 0. Otherwise, the corresponding
result bit is set to 1.
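The arithmetic operators of 5.2.1 can be exercised with a brief Python sketch. The function names `shift_left`, `shift_right` and `umod` are illustrative assumptions, not identifiers from this document. The sketch takes a left shift by s as multiplication by 2 to the power s and a right shift as the floor of the division by 2 to the power s, and it assumes a positive modulus a.

```python
def shift_left(x: int, s: int) -> int:
    """x << s, taken as multiplication by 2**s."""
    return x * 2 ** s


def shift_right(x: int, s: int) -> int:
    """x >> s, taken as the floor of x / 2**s (also for negative x)."""
    return x // 2 ** s


def umod(x: int, a: int) -> int:
    """Unique y with 0 <= y <= a - 1 and y + N*a == x for some integer N (a > 0)."""
    return x % a


print(shift_left(3, 4), shift_right(-7, 1), umod(-3, 8))
print(bin(0b1100 & 0b1010), bin(0b1100 ^ 0b1010))
```

Python's built-in `<<`, `>>` and `%` agree with these definitions for integers and a positive modulus. In C, right-shifting a negative value and `%` with a negative operand behave differently or are implementation-defined, so a port would need explicit handling.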
5.2.2 Logical operators
|| logical OR
&& logical AND
! logical NOT
5.2.3 Relational operators
> greater than
>= greater than or equal to
< less than
<= less than or equal to
== equal to
!= not equal to
5.2.4 Precedence order of operators
Operators are listed in descending order of precedence. If several operators appear in the same line,
they have equal precedence. When several operators of equal precedence appear at the same level in an
expression, evaluation proceeds according to the associativity of the operator either from right to left
or from left to right.
Operators Type of operation Associativity
() expression left to right
[] indexing of arrays left to right
++, -- increment, decrement left to right
!, – logical not, unary negation

×, / multiplication, division left to right
umod modulo (remainder) left to right
+, − addition and subtraction left to right
& bitwise AND left to right
^ bitwise XOR left to right
&& logical AND left to right
|| logical OR left to right
<<, >> left shift and right shift left to right
< , >, <=, >= relational left to right
5.2.5 Mathematical functions
|x| absolute value, is –x for x < 0, otherwise x
sign(x) sign of x, zero if x is zero, +1 if x is positive, -1 if x is negative
clamp(x,min,max) clamps x to the range [min,max]: returns min if x < min, max if x > max or
otherwise x
$\lceil x \rceil$ ceiling of x; returns the smallest integer that is greater than or equal to x
$\lfloor x \rfloor$ floor of x; returns the largest integer that is less than or equal to x
$\lfloor x \rceil$ rounding of x to the nearest integer, equivalent to $\mathrm{sign}(x)\,\lfloor |x| + 0.5 \rfloor$
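A minimal Python sketch of the functions in 5.2.5 follows. The names `sign`, `clamp` and `round_nearest` mirror the notation above but are otherwise illustrative assumptions. The rounding function rounds halves away from zero, which differs from Python's built-in `round()`, which rounds halves to the nearest even integer.

```python
import math


def sign(x: float) -> int:
    """Zero if x is zero, +1 if x is positive, -1 if x is negative."""
    return (x > 0) - (x < 0)


def clamp(x: float, lo: float, hi: float) -> float:
    """Return lo if x < lo, hi if x > hi, otherwise x."""
    return lo if x < lo else hi if x > hi else x


def round_nearest(x: float) -> int:
    """Round to the nearest integer as sign(x) * floor(|x| + 0.5)."""
    return sign(x) * math.floor(abs(x) + 0.5)


print(round_nearest(2.5), round_nearest(-2.5), round(2.5))
print(clamp(300, 0, 255), clamp(-4, 0, 255), clamp(17, 0, 255))
```

The first line of output shows the tie-breaking difference: the formula gives 3 and -3, while the built-in `round(2.5)` gives 2.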
6 General
6.1 Functional overview on the decoding process
This document specifies the JPEG Pleno Light Field superbox and the JPEG Pleno light field decoding
algorithm. The generic JPEG Pleno Light Field superbox syntax is specified in Annex A.
The specified light field decoding algorithm distinguishes two coding modes:
— 4D Transform mode: this mode is specified in Annex B and is based on a 4D inverse discrete cosine
transform (IDCT) and 4D block partitioning and 4D bit-plane hexadeca-tree decoding;
— 4D Prediction mode: this mode predicts intermediate views from reference
views and normalized disparity maps. The signalling syntax and decoding of the reference views is
addressed in Annex C, the normalized disparity views in Annex D, and the prediction parameters
and residual views in Annex E. The intermediate views are reconstructed in a decoding process that
involves view warping, view merging and prediction error correction.
The overall architecture (Figure 1) provides the flexibility to configure the encoding and decoding
system depending on the requirements of the addressed use case.
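The three reconstruction steps named for the 4D Prediction mode (view warping, view merging and prediction error correction) can be pictured with a toy Python sketch on a single image row. Everything here is a simplifying assumption for illustration: the function names, the integer per-pixel disparities, the plain averaging used for merging and the treatment of unfilled pixels are not taken from this document, whose normative procedures are given in Annexes C to E.

```python
from typing import List, Optional

Row = List[Optional[float]]   # None marks a pixel that no reference sample reached


def warp_row(row: List[float], disparity: List[int]) -> Row:
    """Forward-warp a row: sample u moves to u + disparity[u]; holes stay None."""
    out: Row = [None] * len(row)
    for u, value in enumerate(row):
        target = u + disparity[u]
        if 0 <= target < len(row):
            out[target] = value
    return out


def merge_rows(candidates: List[Row]) -> Row:
    """Average, per pixel, the warped candidates that are available."""
    merged: Row = []
    for samples in zip(*candidates):
        known = [x for x in samples if x is not None]
        merged.append(sum(known) / len(known) if known else None)
    return merged


def correct_row(predicted: Row, residual: List[float]) -> List[float]:
    """Add the decoded residual; unfilled pixels are treated as 0 in this toy."""
    return [(p if p is not None else 0.0) + r for p, r in zip(predicted, residual)]


warped_a = warp_row([10, 20, 30, 40], [1, 1, 1, 1])
warped_b = warp_row([12, 22, 32, 42], [-1, -1, -1, -1])
predicted = merge_rows([warped_a, warped_b])
print(correct_row(predicted, [1, 0, -1, 2]))
```

Each reference row is shifted towards the intermediate view, the overlapping samples are combined, and the residual absorbs whatever the prediction missed.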

Figure 1 — Generic JPEG Pleno light field decoder architecture
6.2 Encoder requirements
An encoding process converts source light field data to coded light field data.
In order to conform with this document, an encoder shall conform with the codestream format syntax
and file format syntax specified in the annexes for the encoding process(es) embodied by the encoder.
6.3 Decoder requirements
A decoding process converts coded light field data to reconstructed light field data. Annexes A through
E describe and specify the decoding process.
A decoder is an embodiment of the decoding process. In order to conform to this document, a decoder
shall convert all, or specific parts of, any coded light field data that conform to the file format syntax
and codestream syntax specified in Annexes A to E to a reconstructed light field.
7 Organization of the document
Annex A specifies the description of the JPEG Pleno Light Field superbox.
This document specifies two approaches to the compressed representation of light field data:
the 4D Transform mode is specified in Annex B and the 4D Prediction mode is specified in Annex C,
Annex D and Annex E. Annex C details the signalling of the reference view data, Annex D the signalling
of the normalized disparity views and finally, Annex E the signalling of the prediction parameters to
generate the intermediate views and residual view data to compensate for prediction errors.

Annex A
(normative)

JPEG Pleno Light Field superbox
A.1 General
This annex specifies the use of the JPEG Pleno Light Field superbox which is designed to contain
compressed light field data and associated metadata. The listed boxes shall comply with their definitions
as specified in ISO/IEC 21794-1.
This document may redefine the binary structure of some boxes defined as part of the ISO/IEC 15444-1
or ISO/IEC 15444-2 file formats. For those boxes, the definition found in this document shall be used for
all JPL files.
A.2 Organization of the JPEG Pleno Light Field superbox
Figure A.1 shows the hierarchical organization of the JPEG Pleno Light Field superbox contained by a
JPL file. This illustration does not specify nor imply a specific order to these boxes. In many cases, the
file will contain several boxes of a particular box type. The meaning of each of those boxes is dependent
on the placement and order of that particular box within the file.
This superbox is composed of the following core elements:
— a JPEG Pleno Light Field Header box containing parameterization information about the light field
such as size and colour parameters;
— a JPEG Pleno Light Field Reference View box containi
...

FINAL DRAFT INTERNATIONAL STANDARD
ISO/IEC FDIS 21794-2
ISO/IEC JTC 1/SC 29
Secretariat: JISC
Voting begins on: 2021-01-04
Voting terminates on: 2021-03-01
Information technology — Plenoptic image coding system (JPEG Pleno) —
Part 2:
Light field coding
RECIPIENTS OF THIS DRAFT ARE INVITED TO SUBMIT, WITH THEIR COMMENTS, NOTIFICATION OF ANY RELEVANT PATENT RIGHTS OF WHICH THEY ARE AWARE AND TO PROVIDE SUPPORTING DOCUMENTATION.
IN ADDITION TO THEIR EVALUATION AS BEING ACCEPTABLE FOR INDUSTRIAL, TECHNOLOGICAL, COMMERCIAL AND USER PURPOSES, DRAFT INTERNATIONAL STANDARDS MAY ON OCCASION HAVE TO BE CONSIDERED IN THE LIGHT OF THEIR POTENTIAL TO BECOME STANDARDS TO WHICH REFERENCE MAY BE MADE IN NATIONAL REGULATIONS.
Reference number
ISO/IEC FDIS 21794-2:2021(E)
© ISO/IEC 2021

---------------------- Page: 1 ----------------------
ISO/IEC FDIS 21794-2:2021(E)

COPYRIGHT PROTECTED DOCUMENT
© ISO/IEC 2021
All rights reserved. Unless otherwise specified, or required in the context of its implementation, no part of this publication may
be reproduced or utilized otherwise in any form or by any means, electronic or mechanical, including photocopying, or posting
on the internet or an intranet, without prior written permission. Permission can be requested from either ISO at the address
below or ISO’s member body in the country of the requester.
ISO copyright office
CP 401 • Ch. de Blandonnet 8
CH-1214 Vernier, Geneva
Phone: +41 22 749 01 11
Email: copyright@iso.org
Website: www.iso.org
Published in Switzerland
ii © ISO/IEC 2021 – All rights reserved

---------------------- Page: 2 ----------------------
ISO/IEC FDIS 21794-2:2021(E)

Contents Page
Foreword .iv
Introduction .v
1 Scope . 1
2 Normative references . 1
3 Terms and definitions . 1
4 Symbols and abbreviated terms . 3
4.1 Symbols . 3
4.2 Abbreviated terms . 7
5 Conventions . 8
5.1 Naming conventions for numerical values . 8
5.2 Operators . 8
5.2.1 Arithmetic operators . 8
5.2.2 Logical operators . 9
5.2.3 Relational operators . 9
5.2.4 Precedence order of operators . 9
5.2.5 Mathematical functions .10
6 General .10
6.1 Functional overview on the decoding process .10
6.2 Encoder requirements .11
6.3 Decoder requirements.11
7 Organization of the document .11
Annex A (normative) JPEG Pleno Light Field superbox .12
Annex B (normative) 4D transform mode .29
Annex C (normative) JPEG Pleno light field reference view decoding .73
Annex D (normative) JPEG Pleno light field normalized disparity view decoding .81
Annex E (normative) JPEG Pleno Light Field Intermediate View superbox .89
Bibiliography .117
© ISO/IEC 2021 – All rights reserved iii

---------------------- Page: 3 ----------------------
ISO/IEC FDIS 21794-2:2021(E)

Foreword
ISO (the International Organization for Standardization) and IEC (the International Electrotechnical
Commission) form the specialized system for worldwide standardization. National bodies that
are members of ISO or IEC participate in the development of International Standards through
technical committees established by the respective organization to deal with particular fields of
technical activity. ISO and IEC technical committees collaborate in fields of mutual interest. Other
international organizations, governmental and non-governmental, in liaison with ISO and IEC, also
take part in the work.
The procedures used to develop this document and those intended for its further maintenance are
described in the ISO/IEC Directives, Part 1. In particular, the different approval criteria needed for
the different types of document should be noted. This document was drafted in accordance with the
editorial rules of the ISO/IEC Directives, Part 2 (see www .iso .org/ directives).
Attention is drawn to the possibility that some of the elements of this document may be the subject
of patent rights. ISO and IEC shall not be held responsible for identifying any or all such patent
rights. Details of any patent rights identified during the development of the document will be in the
Introduction and/or on the ISO list of patent declarations received (see www .iso .org/ patents) or the IEC
list of patent declarations received (see patents.iec.ch).
Any trade name used in this document is information given for the convenience of users and does not
constitute an endorsement.
For an explanation of the voluntary nature of standards, the meaning of ISO specific terms and
expressions related to conformity assessment, as well as information about ISO's adherence to the
World Trade Organization (WTO) principles in the Technical Barriers to Trade (TBT), see www .iso .org/
iso/ foreword .html.
This document was prepared by Joint Technical Committee ISO/IEC JTC 1, Information technology,
Subcommittee SC 29, Coding of audio, picture, multimedia and hypermedia information.
A list of all parts in the ISO/IEC 21794 series can be found on the ISO website.
Any feedback or questions on this document should be directed to the user’s national standards body. A
complete listing of these bodies can be found at www .iso .org/ members .html.
iv © ISO/IEC 2021 – All rights reserved

---------------------- Page: 4 ----------------------
ISO/IEC FDIS 21794-2:2021(E)

Introduction
This document is part of a series of standards for a system known as JPEG Pleno. This document defines
the JPEG Pleno framework. It facilitates the capture, representation, exchange and visualization of
plenoptic imaging modalities. A plenoptic image modality can be a light field, point cloud or hologram,
which are sampled representations of the plenoptic function in the form of, respectively, a vector
function that represents the radiance of a discretized set of light rays, a collection of points with
position and attribute information, or a complex wavefront. The plenoptic function describes the
radiance in time and in space obtained by positioning a pinhole camera at every viewpoint in 3D spatial
coordinates, every viewing angle and every wavelength, resulting in a 7D function.
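As an informal illustration, the 7D plenoptic function described above is often written in the following form. The symbol names used here (viewpoint $(x, y, z)$, viewing angle $(\theta, \phi)$, wavelength $\lambda$ and time $\tau$) are assumptions chosen for this sketch and are not symbols defined by this document.

```latex
% 7D plenoptic function: radiance observed from viewpoint (x, y, z),
% in viewing direction (\theta, \phi), at wavelength \lambda and time \tau.
P = P(x, y, z, \theta, \phi, \lambda, \tau)
```

A light field, as handled in this document, is a sampled representation of this function and is addressed with view coordinates $(t, s)$ and sample coordinates $(v, u)$ (see 4.1).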
JPEG Pleno specifies tools for coding these modalities while providing advanced functionality at system
level, such as support for data and metadata manipulation, editing, random access and interaction,
protection of privacy and ownership rights.
FINAL DRAFT INTERNATIONAL STANDARD ISO/IEC FDIS 21794-2:2021(E)
Information technology — Plenoptic image coding system
(JPEG Pleno) —
Part 2:
Light field coding
1 Scope
This document specifies a coded codestream format for storage of light field modalities as well as
associated metadata descriptors that are light field modality specific. This document also provides
information on the encoding tools.
2 Normative references
The following documents are referred to in the text in such a way that some or all of their content
constitutes requirements of this document. For dated references, only the edition cited applies. For
undated references, the latest edition of the referenced document (including any amendments) applies.
ITU-T Rec. T.800 | ISO/IEC 15444-1, Information technology — JPEG 2000 image coding system — Part 1:
Core coding system
ITU-T Rec. T.801 | ISO/IEC 15444-2, Information technology — JPEG 2000 image coding system — Part 2:
Extensions
ISO/IEC 21794-1:2020, Information technology — Plenoptic image coding system (JPEG Pleno) — Part 1:
Framework
ISO/IEC 60559, Information technology — Microprocessor Systems — Floating-Point arithmetic
3 Terms and definitions
For the purposes of this document the terms and definitions given in ISO/IEC 21794-1 and the
following apply.
ISO and IEC maintain terminological databases for use in standardization at the following addresses:
— ISO Online browsing platform: available at https://www.iso.org/obp
— IEC Electropedia: available at http://www.electropedia.org/
3.1
arithmetic coder
entropy coder that converts variable length strings to variable length codes (encoding) and vice versa
(decoding)
3.2
bit-plane
two-dimensional array of bits
3.3
4D bit-plane
four-dimensional array of bits
3.4
coefficient
numerical value that is the result of a transformation or linear regression
3.5
compression
reduction in the number of bits used to represent source image data
3.6
depth
distance of a point in 3D space to the camera plane
3.7
disparity view
image that for each pixel of the subaperture view contains the apparent pixel shift between two
subaperture views along either horizontal or vertical axis
3.8
hexadeca-tree
division of a 4D region into 16 (sixteen) 4D subregions
3.9
pixel
collection of sample values in the spatial image domain having all the same sample coordinates
EXAMPLE A pixel may consist of three samples describing its red, green and blue value.
3.10
plenoptic function
amount of radiance in time and in space by positioning a pinhole camera at every viewpoint in 3D
spatial coordinates, every viewing angle and every wavelength, resulting in a 7D representation
3.11
reference view
subaperture view that is used as one of the references to generate the intermediate views
3.12
subaperture view
subaperture image
image taken of the 3D scene by a pinhole camera positioned at a particular viewpoint and viewing angle
3.13
texture
pixel attributes
EXAMPLE Colour information, opacity, etc.
3.14
transform
transformation
mathematical mapping from one signal space to another
4 Symbols and abbreviated terms
4.1 Symbols
Codestream_Body() coded image data in the codestream without Codestream_Header()
Codestream_Header() codestream header preceding the image data in the codestream

$D^{DEC}(t,s,v,u)$ decoded normalized disparity value at view $(t,s)$ for pixel location $(v,u)$
$D(t,s,v,u)$ normalized disparity value at view $(t,s)$ for pixel location $(v,u)$
$DPEC_k$ pointer to contiguous codestream for normalized disparity view $k$
$D_{shift}$ scaling parameter to translate quantized normalized disparity maps to positive range
DCODEC disparity view codec type
$f$ focal length
$FPW_p$ fixed-weight merging parameter for view $p$
$H(t,s)$ view hierarchy value for view $(t,s)$
$HCC(t,s)$ horizontal camera centre coordinate for view $(t,s)$
$H_D(t,s)$ binary value defining the availability of a normalized disparity view $(t,s)$
$J_0$ Lagrangian encoding cost
$J_1$ Lagrangian encoding cost of spatial partitioning
$J_2$ Lagrangian encoding cost of view partitioning
$KR_{p,c}$ sparse filter regressor mask of texture component $c$ for view $p$
LightField() JPEG Pleno light field codestream
$LSW_j^{p,c}$ quantized least-squares merging weight of texture component $c$ for view $p$, $j = 1, 2, \ldots, NLS_p$
MIDV absolute value of the minimum value over all quantized normalized disparity views
$MMODE_p$ view merging mode for intermediate view $p$
$MSP_p$ sparse filter order for view $p$
$NLS_p$ number of least-squares merging coefficients for intermediate view $p$
$NRT_p$ regressor template size parameter for sparse filter for view $p$
NC number of components in an image
$N_I$ number of intermediate views
$N_{NDV}$ number of reference normalized disparity views
$N_p^D$ number of normalized disparity reference views for intermediate view $p$
$N_p^T$ number of texture reference views for intermediate view $p$
$N_{REF}$ number of reference views
$N_{RES}$ number of prediction residual views
$N_{sp}$ total available number of regressors for sparse filter
Plev level a particular codestream complies to
Ppih profile a particular codestream complies to
$Q_p$ 2D image of dimensions $V \times U$, defines the occlusion state-based segmentation at intermediate view $p$
Q normalized disparity quantization parameter
R rate or bitrate, expressed in bit per sample
RCODEC prediction residual view codec type
RDATA array of bytes containing for a single prediction residual view the RCODEC codestream after header information has been stripped
RENCODING array of bytes containing for a single prediction residual view the full DCODEC codestream
RGB colour data for the red, green and blue colour component of a pixel
RHEADER array of bytes containing for a single prediction residual view the header information from the RCODEC codestream
$RPEC_j$ pointer to contiguous codestream for prediction residual view $j$
$s$ coordinate of the addressed subaperture image along the s-axis
$S$ size of the light field image along the s-axis (COLUMNS)
$s_{ii}^{Tr}$ subscript of the column index of the reference view, $ii = 1, 2, \ldots, N_p^T$, in the light field array in row-wise scanning order
$s_{jj}^{Dr}$ subscript of the column index of the reference normalized disparity view, $jj = 1, 2, \ldots, N_p^D$, in the light field array in row-wise scanning order
$SF_p$ binary variable, determines if sparse filter is used (true) or not (false)
$SPW_j^{p,0}$ quantized sparse filter coefficients of texture component $c$ for view $p$, $j = 1, 2, \ldots, MSP_p$
$SPW_j^{p,c}$ de-quantized sparse filter coefficients of texture component $c$ for view $p$, $j = 1, 2, \ldots, MSP_p$
$t$ coordinate of the addressed subaperture image along the t-axis
$T$ size of the light field image along the t-axis (ROWS)
$t_{ii}^{Tr}$ subscript of the row index of the reference view, $ii = 1, 2, \ldots, N_p^T$, in the light field array in row-wise scanning order
$t_{jj}^{Dr}$ subscript of the row index of the reference normalized disparity view, $jj = 1, 2, \ldots, N_p^D$, in the light field array in row-wise scanning order
$(t_k^D, s_k^D)$ view coordinate subscripts for normalized disparity view $k$
$(t_l^X, s_l^X)$ view coordinate subscripts for reference view $l$
$(t_p^I, s_p^I)$ view coordinate subscripts for intermediate view $p$
$t_k \times s_k \times v_k \times u_k$ 4D block dimensions at the 4D block partitioning stage
$t_b \times s_b \times v_b \times u_b$ 4D block dimensions at the bit-plane hexadeca-tree decomposition stage
TCODEC reference view codec type
TDATA array of bytes, containing for a single reference view, the TCODEC codestream, after header information has been stripped
TENCODING array of bytes, containing for a single reference view the full TCODEC codestream
THEADER array of bytes, containing for a single reference view the header information from the TCODEC codestream
$TPEC_l$ pointer to contiguous codestream for reference view $l$
$u$ sample coordinate along the u-axis within the addressed subaperture image
$U$ size of the subaperture image along the u-axis (WIDTH)
$v$ sample coordinate along the v-axis within the addressed subaperture image
$V$ size of the subaperture image along the v-axis (HEIGHT)
$VCC(t,s)$ vertical camera centre coordinate for view $(t,s)$
$VPP_p$ view prediction parameters for intermediate view $p$
$X(t,s,v,u,c)$ texture value at view $(t,s)$ for pixel location $(v,u)$ for texture component $c$
$X^{DEC}(t,s,v,u,c)$ decoded texture value at view $(t,s)$ for pixel location $(v,u)$ for texture component $c$
$X_W^{(t_1,s_1)}(t_2,s_2)$ result of warping the texture view $(t_1,s_1)$ to view location $(t_2,s_2)$
$\Delta x$ horizontal distance between a pair of camera centres
$\Delta y$ vertical distance between a pair of camera centres
YCbCr colour data for the luminance, the blue chrominance and the red chrominance component of a pixel
$z(t,s,v,u)$ depth value at view $(t,s)$ for pixel location $(v,u)$
$\hat{\theta}_i^p$ distance based merging weight for reference view $i = 1, \ldots, N_p^T$ at intermediate view $p$
$\alpha_i^p$ distance based factor, used for defining the merging weight, at intermediate view $p$ for reference view $i = 1, \ldots, N_p^T$
$\Gamma_p$ binary matrix, defining the locations of the non-zero merging weights in merging weight matrix $\Theta_{p,c}$ at intermediate view $p$. It is identical between all colour components $c$
$\theta_j^{p,c}$ de-quantized least-squares merging weight of texture component $c$ for view $p$, $j = 1, 2, \ldots, NLS_p$
$\theta_{p,c}^{sp}$ sparse filter coefficients at intermediate view $p$ for colour component $c$
$\Theta_{p,c}$ merging weight matrix for intermediate view $p$ for colour component $c$
$\Upsilon_{p,c}$ locations of the non-zero elements of $\Psi_{(v,u)}$
$\Psi_{(v,u)}$ regressor template at pixel location $(v,u)$
$\Omega_p^{Dr}$ set of reference normalized disparity views for intermediate view $p$
$\Omega_p^{occlD}$ set of occluded pixels, which remain to be inpainted, during normalized disparity view synthesis at intermediate view $p$
$\Omega_p^{occlT}$ set of occluded pixels, which remain to be inpainted, during texture view synthesis at intermediate view $p$
$\Omega_p^{Tr}$ set of reference views for intermediate view $p$
4.2 Abbreviated terms
2D two dimensional
3D three dimensional
4D four dimensional
DCT discrete cosine transform
floating point floating point notation as specified in ISO/IEC 60559
HTTP hypertext transfer protocol
IDCT inverse DCT
IPR intellectual property rights
IV intermediate view; subaperture view that is generated from surrounding reference view(s)
JPEG Joint Photographic Experts Group
JPL JPEG Pleno file format
LSB least significant bit
MSB most significant bit
R-D rate-distortion
RV reference view
URL uniform resource locator
XML eXtensible Markup Language
5 Conventions
5.1 Naming conventions for numerical values
Integer numbers are expressed as bit patterns, hexadecimal values or decimal numbers. Bit patterns
and hexadecimal values have both a numerical value and an associated particular length in bits.
Hexadecimal notation, indicated by prefixing the hexadecimal number by "0x", may be used instead
of binary notation to denote a bit pattern having a length that is an integer multiple of 4. For example,
0x41 represents an eight-bit pattern having only its second most significant bit and its least significant
bit equal to 1. Numerical values that are specified under a "Code" heading in tables that are referred to
as "code tables" are bit pattern values (specified as a string of digits equal to 0 or 1 in which the left-
most bit is considered the most-significant bit). Other numerical values not prefixed by "0x" are decimal
values. When used in expressions, a hexadecimal value is interpreted as having a value equal to the
value of the corresponding bit pattern evaluated as a binary representation of an unsigned integer (i.e.
as the value of the number formed by prefixing the bit pattern with a sign bit equal to 0 and interpreting
the result as a two's complement representation of an integer value). For example, the hexadecimal
value 0xF is equivalent to the 4-bit pattern '1111' and is interpreted in expressions as being equal to the
decimal number 15.
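The conventions above can be sketched in a few lines of Python. This is an informal illustration only; the helper names `bit_pattern` and `unsigned_value` are hypothetical and are not defined by this document.

```python
def bit_pattern(hex_literal: str) -> str:
    """Return the bit pattern denoted by a '0x'-prefixed hexadecimal literal."""
    if not hex_literal.startswith("0x"):
        raise ValueError("expected a '0x' prefix")
    digits = hex_literal[2:]
    # Each hexadecimal digit contributes exactly 4 bits, so the length of
    # the pattern is always an integer multiple of 4.
    return format(int(digits, 16), "0{}b".format(4 * len(digits)))


def unsigned_value(pattern: str) -> int:
    """Evaluate a bit pattern (left-most bit is the MSB) as an unsigned integer."""
    return int(pattern, 2)


print(bit_pattern("0x41"))                 # 01000001
print(unsigned_value(bit_pattern("0xF")))  # 15
```

The first call reproduces the 0x41 example (only the second most significant bit and the least significant bit are 1), and the second reproduces the 0xF example.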
5.2 Operators
NOTE Many of the operators used in this document are similar to those used in the C programming
language.
5.2.1 Arithmetic operators
+ addition
− subtraction (as a binary operator) or negation (as a unary prefix operator)
× multiplication
/ division without truncation or rounding
<< left shift; x<<s is defined as $x \times 2^{s}$
>> right shift; x>>s is defined as $\lfloor x / 2^{s} \rfloor$
++ increment with 1
-- decrement with 1
umod x umod a is the unique value y between 0 and a–1
for which y+Na = x with a suitable integer N
& bitwise AND operator; compares each bit of the first operand to the corresponding bit
of the second operand
If both bits are 1, the corresponding result bit is set to 1. Otherwise, the corresponding
result bit is set to 0.
^ bitwise XOR operator; compares each bit of the first operand to the corresponding bit
of the second operand
If both bits are equal, the corresponding result bit is set to 0. Otherwise, the corresponding
result bit is set to 1.
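The shift and umod definitions above can be illustrated with a minimal Python sketch. The function names `shl`, `shr` and `umod` are hypothetical and chosen for this example only; the sketch assumes integer operands, a non-negative shift amount s and a positive modulus a.

```python
def shl(x: int, s: int) -> int:
    """Left shift: x multiplied by 2 to the power s."""
    return x * (2 ** s)


def shr(x: int, s: int) -> int:
    """Right shift: floor of x divided by 2 to the power s."""
    # Integer floor division rounds towards minus infinity, which matches
    # the floor in the definition, including for negative x.
    return x // (2 ** s)


def umod(x: int, a: int) -> int:
    """Unique y in [0, a - 1] such that y + N * a == x for some integer N."""
    if a <= 0:
        raise ValueError("a must be positive")
    return x - a * (x // a)


print(shl(3, 2))             # 12
print(shr(-5, 1))            # -3
print(umod(-7, 4))           # 1
print(bin(0b1100 & 0b1010))  # 0b1000
print(bin(0b1100 ^ 0b1010))  # 0b110
```

Note that umod differs from the C `%` operator for negative x: in C, `-7 % 4` evaluates to -3, whereas umod yields 1.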
5.2.2 Logical operators
|| logical OR
&& logical AND
! logical NOT
5.2.3 Relational operators
> greater than
>= greater than or equal to
< less than
<= less than or equal to
== equal to
!= not equal to
5.2.4 Precedence order of operators
Operators are listed in descending order of precedence. If several operators appear in the same line,
they have equal precedence. When several operators of equal precedence appear at the same level in an
expression, evaluation proceeds according to the associativity of the operator either from right to left
or from left to right.
Operators Type of operation Associativity
() expression left to right
[] indexing of arrays left to right
++, -- increment, decrement left to right
!, – logical not, unary negation
×, / multiplication, division left to right
umod modulo (remainder) left to right
+, − addition and subtraction left to right
& bitwise AND left to right
^ bitwise XOR left to right
&& logical AND left to right
|| logical OR left to right
<<, >> left shift and right shift left to right
< , >, <=, >= relational left to right
5.2.5 Mathematical functions
|x| absolute value, is –x for x < 0, otherwise x
sign(x) sign of x, zero if x is zero, +1 if x is positive, -1 if x is negative
clamp(x,min,max) clamps x to the range [min,max]: returns min if x < min, max if x > max or
otherwise x
⎾x⏋ ceiling of x; returns the smallest integer that is greater than or equal to x
⎿x⏌ floor of x; returns the largest integer that is less than or equal to x
⎿x⏋ rounding of x to the nearest integer, equivalent to sign(x) × ⎿|x| + 0.5⏌
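A minimal Python sketch of these functions follows. The names `sign`, `clamp` and `round_nearest` are illustrative, and the rounding function assumes the reading of the definition above in which halves are rounded away from zero.

```python
import math


def sign(x) -> int:
    """Return 0 if x is zero, +1 if x is positive, -1 if x is negative."""
    return (x > 0) - (x < 0)


def clamp(x, lo, hi):
    """Return lo if x < lo, hi if x > hi, otherwise x."""
    if x < lo:
        return lo
    if x > hi:
        return hi
    return x


def round_nearest(x) -> int:
    """Round to the nearest integer as sign(x) * floor(|x| + 0.5)."""
    return sign(x) * math.floor(abs(x) + 0.5)


print(round_nearest(2.5))   # 3
print(round_nearest(-2.5))  # -3
print(clamp(300, 0, 255))   # 255
```

This differs from Python's built-in `round`, which rounds halves to the nearest even integer (`round(2.5)` gives 2), so the built-in is not a drop-in replacement for the rounding defined here.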
6 General
6.1 Functional overview of the decoding process
This document specifies the JPEG Pleno Light Field superbox and the JPEG Pleno light field decoding
algorithm. The generic JPEG Pleno Light Field superbox syntax is specified in Annex A.
The specified light field decoding algorithm distinguishes two coding modes:
— 4D Transform mode: this mode is specified in Annex B and is based on a 4D inverse discrete cosine
transform (IDCT), 4D block partitioning and 4D bit-plane hexadeca-tree decoding;
— 4D Prediction mode: this mode is based on the prediction of intermediate views from reference
views and normalized disparity maps. The signalling syntax and decoding of the reference views are
addressed in Annex C, the normalized disparity views in Annex D, and the prediction parameters
and residual views in Annex E. The intermediate views are reconstructed in a decoding process that
involves view warping, view merging and prediction error correction.
The overall architecture (Figure 1) provides the flexibility to configure the encoding and decoding
system depending on the requirements of the addressed use case.
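To make the term hexadeca-tree (3.8) concrete, the sketch below divides a 4D region into 16 4D subregions by halving each of the four dimensions. It is an illustration only: the function names and the midpoint rule (the low half takes floor(size / 2) samples) are assumptions of this example, and the normative partitioning rules are those of Annex B.

```python
from itertools import product


def halve(start: int, size: int):
    """Split the interval [start, start + size) into two non-empty halves."""
    if size < 2:
        raise ValueError("this sketch only splits dimensions of size >= 2")
    low = size // 2  # assumption: the low half takes floor(size / 2) samples
    return [(start, low), (start + low, size - low)]


def hexadeca_split(origin, dims):
    """Return the 16 sub-blocks of a 4D block as (origin, dims) tuples.

    origin and dims are 4-tuples ordered (t, s, v, u).
    """
    per_axis = [halve(o, d) for o, d in zip(origin, dims)]
    blocks = []
    # Two choices per axis over four axes gives 2**4 = 16 combinations.
    for combo in product(*per_axis):
        sub_origin = tuple(start for start, _ in combo)
        sub_dims = tuple(size for _, size in combo)
        blocks.append((sub_origin, sub_dims))
    return blocks


blocks = hexadeca_split((0, 0, 0, 0), (4, 4, 8, 8))
print(len(blocks))  # 16
print(blocks[0])    # ((0, 0, 0, 0), (2, 2, 4, 4))
```

The 16 sub-blocks tile the parent block exactly, so their sample counts add up to the sample count of the parent.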
Figure 1 — Generic JPEG Pleno light field decoder architecture
6.2 Encoder requirements
An encoding process converts source light field data to coded light field data.
In order to conform with this document, an encoder shall conform with the codestream format syntax
and file format syntax specified in the annexes for the encoding process(es) embodied by the encoder.
6.3 Decoder requirements
A decoding process converts coded light field data to reconstructed light field data. Annexes A through
E describe and specify the decoding process.
A decoder is an embodiment of the decoding process. In order to conform to this document, a decoder
shall convert all, or specific parts of, any coded light field data that conform to the file format syntax
and codestream syntax specified in Annexes A to E to a reconstructed light field.
7 Organization of the document
Annex A specifies the JPEG Pleno Light Field superbox.
This document specifies two approaches to the compressed representation of light field data:
the 4D Transform mode is specified in Annex B and the 4D Prediction mode is specified in Annex C,
Annex D and Annex E. Annex C details the signalling of the reference view data, Annex D the signalling
of the normalized disparity views and, finally, Annex E the signalling of the prediction parameters to
generate the intermediate views and of the residual view data to compensate for prediction errors.
Annex A
(normative)

JPEG Pleno Light Field superbox
A.1 General
This annex specifies the use of the JPEG Pleno Light Field superbox which is designed to contain
compressed light field data and associated metadata. The listed boxes shall comply with their definitions
as specified in ISO/IEC 21794-1.
This document may redefine the binary structure of some boxes defined as part of the ISO/IEC 15444-1
or ISO/IEC 15444-2 file formats. For those boxes, the definition found in this document shall be used for
all JPL files.
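As an informal aid, the sketch below reads a single box header. It assumes the general box layout of the ISO/IEC 15444-1 file format (a 4-byte big-endian LBox length, a 4-byte TBox type and an optional 8-byte XLBox extended length when LBox equals 1), which JPL files are understood to share. The function name `read_box_header` and the box type `demo` are hypothetical; the box definitions in ISO/IEC 21794-1 and in this annex take precedence.

```python
import struct


def read_box_header(buf: bytes, offset: int = 0):
    """Return (box_type, payload_offset, payload_length) for the box at offset."""
    lbox, tbox = struct.unpack_from(">I4s", buf, offset)
    header_len = 8
    if lbox == 1:
        # LBox == 1: the actual length is carried in the 8-byte XLBox field.
        (box_len,) = struct.unpack_from(">Q", buf, offset + 8)
        header_len = 16
    elif lbox == 0:
        # LBox == 0: the box extends to the end of the available data.
        box_len = len(buf) - offset
    else:
        box_len = lbox
    if box_len < header_len or offset + box_len > len(buf):
        raise ValueError("malformed or truncated box")
    return tbox.decode("latin-1"), offset + header_len, box_len - header_len


# Hypothetical 12-byte box of type 'demo' carrying a 4-byte payload.
box = struct.pack(">I4s", 12, b"demo") + b"abcd"
print(read_box_header(box))  # ('demo', 8, 4)
```

A superbox is a box whose payload is itself a sequence of boxes, so the same function can be applied repeatedly to the payload range to walk the hierarchy.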
A.2 Organization of the JPEG Pleno Light Field superbox
Figure A.1 shows the hierarchical organization of the JPEG Pleno Light Field superbox contained by a
JPL file. This illustration does not specify nor imply a specific order to these boxes. In many cases, the
file will contain sev
...
