\documentclass[%showframe%
]{oclh_doc}
\usepackage{fix-cm}% fix of Computer Modern font DO NOT MOVE!!!
\usepackage{calc}% simple arithmetic in expressions
\usepackage{etoolbox}% some Latex interfaces
\usepackage{longtable,multirow}% tables
\usepackage{adjustbox,%
placeins, % processing of float objects
caption} %
%
% page geometry
%
\usepackage[
top = 0.06734007\paperheight,
bottom = 0.06734007\paperheight,
right = 0.1190476\paperwidth,
left = 0.0952381\paperwidth
]{geometry}
%
% language and fonts
%
\usepackage{amsmath,amsfonts,amssymb,xfrac}
\usepackage[cmintegrals,cmbraces]{newtxmath}
\usepackage{xltxtra,polyglossia,csquotes}
\usepackage{verbatim,fancyvrb,framed}
\usepackage{relsize}
\setmainlanguage{english}
\setotherlanguage{russian}
\setkeys{russian}{babelshorthands=true}
\defaultfontfeatures{Scale=MatchLowercase,Mapping=tex-text}
\input{fonts/font_settings-IBM_Plex.tex}
% \input{fonts/font_settings-Computer_Modern.tex}
% \input{fonts/font_settings-my_fonts.tex}
%
% paragraphs and align
%
\usepackage{indentfirst}
\frenchspacing\sloppy\raggedbottom
\setlength{\parindent}{0.0604762\paperwidth}%
%%%%% additional colontitles settings
\RequirePackage{fancyhdr}
\fancyhf{}\renewcommand{\headrulewidth}{0pt}
\fancyfoot{}
\fancyfoot[R]{\thepage}
\pagestyle{fancy}
%
% index
%
\usepackage[xindy]{imakeidx}
\makeindex[options = -C utf8 -M texindy -M index_style_and_order ]
%
% references
%
\RequirePackage{hyperref}
\RequirePackage{xcolor}
\definecolor{DarkBlue}{rgb}{0,0,0.5}
\hypersetup{colorlinks=true, linkcolor={black},
urlcolor =DarkBlue, citecolor={black}}
%
% lists
%
\usepackage{enumitem}
\newenvironment{CodePar}%
{\Verbatim[samepage=true,frame=single]}%
{\endVerbatim}%
%
\newenvironment{CodeParWithCC}[1]
{\Verbatim[samepage=true,frame=single,commandchars=#1]}
{\endVerbatim}
%
\newenvironment{ImpNote}
{\setlength{\parindent}{0.0604762\paperwidth}%
\setlength{\LTleft}{\parindent}%
\setlength{\LTpre}{\parsep}\setlength{\LTpost}{\parsep}%
\begin{longtable}{|p{\linewidth-\tabcolsep-1\parindent}}
\noindent\textsf{Important warning:}}
{\end{longtable}}
%
\newcommand{\SetLenVarWithWidth}[2]{%
\ifdefined #1 \else%
\newlength{#1}%
\fi%
\settowidth{#1}{#2}%
}%
\newcommand{\NmCnvDescript}%
{\addtolength{\leftskip}{0.0604762\paperwidth}%
\setlength{\parindent}{-0.0604762\paperwidth}}%
%
\newcommand{\verbI}[1]{\textit{\Verb|#1|}}%
\newcommand{\verbIU}[1]{\underline{\smash{\textit{\Verb|#1|}}}}
\title{OpenCL\_helpers library}
\author{hk@r4in.tk\\ mns@r4in.tk}
\makeindex
\begin{document}
\maketitle
\section{Introduction}
The OpenCL\_helpers library is designed to simplify the programming of
multithreaded applications that use GPGPU (general-purpose computing on
graphics processing units). The library does not cover every need of GPGPU
programming; it was written to simplify the development of applications that
use multilevel (hierarchical) parallelism on a single computer. In other
words, it allows a task to be parallelised into threads on the host processor
(CPU) and then parallelised further inside each GPGPU device by dividing the
GPU threads into squads that perform different subtasks.

The library was designed on the assumption that each CPU thread uses its own
separate GPGPU device, but using the same GPGPU device from two or more CPU
threads is not prohibited. Parallelism at the CPU level is provided by POSIX
Threads, and parallelism at the GPGPU level is provided by the library.

The library consists of four parts. Not all of them are directly related to
parallelism, and they can be used independently of each other.

The first part is the set of OpenCL program build
tools~(section~\ref{sec:buildutils}). These tools make it possible to build
OpenCL programs from the command line and to view the build result without
writing and running a separate application.

The second part~(section~\ref{sec:libraryusing}) comprises the CPU functions
of the OpenCL\_helpers library and the syntax that makes it possible to write
header files shared by CPU and GPGPU programs.

The third part~(section~\ref{sec:memalloc}) makes up for the lack of memory
management tools in OpenCL C and provides instruments for GPGPU memory
allocation and deallocation, heap diagnostics and pointer reinterpretation.

The fourth part comprises the GPGPU functions that organize parallelism inside
the GPGPU by dividing the whole set of GPGPU threads into squads, each engaged
in its own task.
\subsection{Build and install}
\label{sec:build_n_install}
\subsubsection{Prerequisites}
\label{subsec:prerequisites}
The following are assumed to be available:
\begin{enumerate}[leftmargin=2\parindent]
\item A computer running Linux.
\item An installed, working C compiler.
\item The standard C library (libc).
\item An OpenCL implementation of version 1.2 or later from any vendor.
\end{enumerate}
\subsubsection{Getting the source code of the OpenCL\_helpers library}
\label{subsec:getting_source}
A copy of the source code of the OpenCL\_helpers library can be downloaded from
the address
\href{https://ggs.void.r4in.tk/hk/OpenCL_helpers/archive/master.tar.gz}%
{\Verb|https://ggs.void.r4in.tk/hk/OpenCL\_helpers/archive/master.tar.gz|}
and then unpacked into a suitable directory.

Alternatively, if the Git version control system~(\href{https://git-scm.com}%
{\Verb|https://git-scm.com|}) is installed, a copy
of the source code of the library can be obtained with the following command:
\par
\indent\indent\verb|git clone |%
\href{https://ggs.void.r4in.tk/hk/OpenCL_helpers.git}%
{\Verb|https://ggs.void.r4in.tk/hk/OpenCL\_helpers.git|}
\subsubsection{Build}
To build with the default settings, change to the
\verb|OpenCL_helpers| directory and run the command\par
\indent\indent\verb|make|\par
\noindent%
The \verb|-j| option may be used for a parallel build. If the
\verb|make| command completes without errors, the following files appear
in the \verb|OpenCL_helpers/build| directory:\par
\indent\indent\verb|liboclh.so.|\verbI{I}\verb|.|\verbI{J}%
\verb| oclh_br oclh_cr oclh_lr|\par
\noindent%
together with a few \verb|*.o| subdirectories containing object files. In the
name of the first file, \verbI{I} is the major version of the library and
\verbI{J} is the minor version.

Each file can be built separately with one of the commands:\par
\indent\indent\verb|make oclh_library|\par
\indent\indent\verb|make oclh_builder|\par
\indent\indent\verb|make oclh_compiler|\par
\indent\indent\verb|make oclh_linker|\par
If necessary, the library can be built for debugging with the command:
\par
\indent\indent\verb|make debug|
\subsubsection{Installation}
Installation is performed by the command:\par
\indent\indent\verb|make install|\par
\noindent%
As a result, the \verb|~/opt/oclh| directory is created, and the executable
files, the library file and the header files are copied into its \verb|bin|,
\verb|lib| and \verb|include| subdirectories respectively. After that it is
advisable to add the \verb|~/opt/oclh/bin| directory to the \verb|PATH|
environment variable and the \verb|~/opt/oclh/lib| directory to the
\verb|LD_LIBRARY_PATH| environment variable.

The destination path can be changed with the command:\par
\indent\indent\verb|make PRFX_PATH=|\verbI{destination\_path}\verb| install|
\subsubsection{Uninstallation}
Uninstallation is performed by the command:\par
\indent\indent\verb|make uninstall|\par
\noindent or\par
\indent\indent\verb|make PRFX_PATH=|\verbI{destination\_path}\verb| uninstall|
\par
\noindent%
if the library was installed in a non-default directory.
\subsubsection{Documentation}
The documentation of the OpenCL\_helpers library is built separately. Building
it requires the \XeTeX /\XeLaTeX\ typesetting
system or another \TeX /\LaTeX-compatible system. The \XeTeX\ system and related
packages are provided within the
\TeX~Live~distribution~(\href{https://www.tug.org/texlive/}%
{\Verb|https://www.tug.org/texlive/|}). Using a system other than \XeTeX\ may
require changes to the source code of the documentation.

In addition to the typesetting system itself, a number of packages are
required, for example xindy for building the index. All packages used to
prepare the documentation are freely available as part of the
\TeX~Live~distribution.

The documentation is built in the
\verb|OpenCL_helpers/documentation| directory by running the build script\par
\indent\indent\verb|./build_script|\par
\noindent%
If the script finishes without errors, the following
files appear in the \verb|OpenCL_helpers/documentation/build| directory:
\par
\indent\indent\verb|opencl_helpers_documentation-russian.pdf|\par
\indent\indent\verb|opencl_helpers_documentation-english.pdf|\par
\noindent%
They contain the documentation in Russian and English, respectively.

The build uses fonts of the IBM~Plex family, but it is possible to switch back
to the basic Computer~Modern family by uncommenting the line\par
\indent\indent\verb|\input{fonts/font_settings-Computer_Modern.tex}|\par
\noindent in the preamble of the documentation source code.
\subsection{Log file format}
\label{subsec:logformat}
\index{log file format}%
The library tools make it possible to record events occurring in an
application in log files; in addition, the library itself writes to the log
file when necessary. The logging functions are described
in~section~\ref{subsec:logfunctions}.

A standard log file entry looks like\par
\begin{CodeParWithCC}{\\\{\}}
\textit{YYYY}-\textit{MM}-\textit{DD} \textit{hh}:\textit{mm}:\textit{ss} ws_0x\textit{HHHH} \textit{entry\_content}
\end{CodeParWithCC}
\noindent%
where
{%
\setlength{\leftskip}{0pt}%
\setlength{\LTpre}{\smallskipamount}\setlength{\LTpost}{\smallskipamount}%
\setlength{\LTleft}{2\parindent-\tabcolsep}
\SetLenVarWithWidth{\Acol}{\verbI{YYYY}}%
\SetLenVarWithWidth{\Bcol}{--}%
\begin{longtable}%
{p{\Acol}p{\Bcol}p{\linewidth-\LTleft-\Acol-\Bcol-5\tabcolsep}}
\verbI{YYYY}&--&year written as four decimal digits;\\
\verbI{MM}&--&month of the year written as two decimal digits from 01 to 12;\\
\verbI{DD}&--&day of the month written as two decimal digits from 01 to 31;\\
\verbI{hh}&--&hour of the day written as two decimal digits from 00 to 23;\\
\verbI{mm}&--&minute of the hour written as two decimal digits from 00 to 59;\\
\verbI{ss}&--&second of the minute written as two decimal digits from 00 to 59;
\\
\verbI{HHHH}&--&the last two bytes of the address of the working configuration
of the GPGPU device (workset, for details see
section~\ref{subsec:structures}) written as four hexadecimal digits.
\end{longtable}
}
\noindent%
\verbI{entry\_content}~--~can be any text passed to a logging function,
but the library itself follows, where possible, the following conventions:
\begin{enumerate}
\item Information related to OpenCL instances is recorded as
\verbI{instance\_type}\verb|_0x|\verbI{HHHH}, where \verbI{HHHH} is the last
two bytes of the instance address, written as four hexadecimal digits. For
example, a GPGPU device could be recorded as \verb|dev_0x2a78|, and a platform
as \verb|platform_0xf190|. An exhaustive list of OpenCL instances is given in
the OpenCL specifications.
\item The symbol ``\verb+|+'' is used to delimit information blocks within an
entry and to mark the relation between such blocks. For example, the
entry\nopagebreak
\begin{CodePar}
2019-06-03 15:42:47 ws_0x9c00 context_0x9f60 | dev_0xf260 | ...
\end{CodePar}
means that the entry describes an event related to OpenCL context
\verb|0x9f60|, which uses GPGPU device \verb|0xf260|.
\item When the recorded information elaborates on what precedes it, an additional space is put before it, for example:\nopagebreak
{\scriptsize
\begin{CodePar}
2019-06-03 15:42:47 ws_0x9c00 context_0x9f60 | Reference count: 1
2019-06-03 15:42:47 ws_0x9c00 context_0x9f60 | Number of devices: 1
2019-06-03 15:42:47 ws_0x9c00 context_0x9f60 | Device ID(s): 0x1acf260
2019-06-03 15:42:47 ws_0x9c00 context_0x9f60 | dev_0xf260 | GPU: 15 units/17...
2019-06-03 15:42:47 ws_0x9c00 context_0x9f60 | dev_0xf260 | Memory: 8116.43...
2019-06-03 15:42:47 ws_0x9c00 context_0x9f60 | dev_0xf260 | Vendor: NVIDIA Corp...
2019-06-03 15:42:47 ws_0x9c00 context_0x9f60 | dev_0xf260 | Model: GeForce GT...
2019-06-03 15:42:47 ws_0x9c00 context_0x9f60 | Context properties:
2019-06-03 15:42:47 ws_0x9c00 context_0x9f60 | Platform: 0xf190
2019-06-03 15:42:47 ws_0x9c00 context_0x9f60 | platform_0xf190 | Profile: FULL_PROFILE
2019-06-03 15:42:47 ws_0x9c00 context_0x9f60 | platform_0xf190 | Version: OpenCL 1...
2019-06-03 15:42:47 ws_0x9c00 context_0x9f60 | platform_0xf190 | Name: NVIDIA CUDA
2019-06-03 15:42:47 ws_0x9c00 context_0x9f60 | platform_0xf190 | Vendor: NVIDIA Corp...
2019-06-03 15:42:47 ws_0x9c00 context_0x9f60 | platform_0xf190 | Extensions: cl_khr...
2019-06-03 15:42:47 ws_0x9c00 context_0x9f60 | Is user responsible for sync: Undefined (presumable No)
\end{CodePar}
}
\item If an error occurs during the execution of a library function, an entry
starting with \verb|oclerr:| is added to the log; it contains information about all function calls from the library to the OpenCL API. For example, the entry\nopagebreak
\begin{CodeParWithCC}{\\\{\}}
\textit{YYYY}-\textit{MM}-\textit{DD} \textit{hh}:\textit{mm}:\textit{ss} ws_0x\textit{HHHH} oclerr:
_ghf_getBuildStatus/clGetProgramBuildInfo/CL_PROGRAM_BUILD_STATUS
returned error -3 - CL_COMPILER_NOT_AVAILABLE
\end{CodeParWithCC}
means that the \verb|_ghf_getBuildStatus| function called the OpenCL API
function \verb|clGetProgramBuildInfo| with the argument
\verb|CL_PROGRAM_BUILD_STATUS| and received in response the error code
\verb|-3|, which stands for \verb|CL_COMPILER_NOT_AVAILABLE|.
\end{enumerate}
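
The \verb|oclerr:| convention can also be decoded mechanically. The following
Python sketch is not part of the library. It assumes that the text after
\verb|oclerr:| has the layout of the example above (a call chain separated by
\verb|/|, the words \verb|returned error|, the numeric code and its name) and
that line breaks inside it may be collapsed into single spaces. The name
\verb|parse_oclerr| is hypothetical.\par
\begin{CodePar}
import re

# Assumed layout of the text after "oclerr:", with line
# breaks collapsed into single spaces:
#   <call>/<call>/... returned error <code> - <name>
OCLERR_RE = re.compile(
    r"^(\S+) returned error (-?\d+) - (\w+)$"
)

def parse_oclerr(text):
    """Return (call_chain, code, name), or None if no match."""
    m = OCLERR_RE.match(" ".join(text.split()))
    if m is None:
        return None
    chain, code, name = m.groups()
    return chain.split("/"), int(code), name
\end{CodePar}
\noindent%
Applied to the example entry, the first element of the result would be the
list of the three names in the call chain, which makes it straightforward to
group errors by the OpenCL API function that reported them.
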
Given that OpenCL instance addresses are unique within one application run, it
is highly likely that the combination of the instance name and the last two
bytes of its address is also unique. These conventions therefore make it
possible to extract the information about a particular OpenCL instance from
the log file by substring filtering.
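
Such filtering can be done with \verb|grep|, or in a few lines of code when
the blocks of an entry are needed separately. The following Python sketch is
not part of the library. It assumes the standard entry layout described above,
and the names \verb|parse_entry| and \verb|entries_for| are hypothetical.\par
\begin{CodePar}
import re

# Assumed layout, taken from the description above:
#   YYYY-MM-DD hh:mm:ss ws_0xHHHH entry_content
ENTRY_RE = re.compile(
    r"^(\d{4}-\d{2}-\d{2}) (\d{2}:\d{2}:\d{2}) "
    r"(ws_0x[0-9a-fA-F]{4}) (.*)$"
)

def parse_entry(line):
    """Split one standard entry; None if it does not match."""
    m = ENTRY_RE.match(line.rstrip("\n"))
    if m is None:
        return None
    date, time, workset, content = m.groups()
    # "|" delimits the information blocks of the content.
    blocks = [b.strip() for b in content.split("|")]
    return {"date": date, "time": time,
            "workset": workset, "blocks": blocks}

def entries_for(lines, tag):
    """Yield parsed entries that have a block equal to tag."""
    for line in lines:
        entry = parse_entry(line)
        if entry is not None and tag in entry["blocks"]:
            yield entry
\end{CodePar}
\noindent%
For example, \verb|entries_for(open("oclh_cr.log"), "dev_0xf260")| would yield
the entries that carry the block \verb|dev_0xf260|. Unlike a plain substring
search, this matches whole blocks only.
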

In addition to the standard log entry there is also the header entry, which
looks like\par
{\small%
\begin{CodeParWithCC}{\\\{\}}
\textit{YYYY}-\textit{MM}-\textit{DD} \textit{hh}:\textit{mm}:\textit{ss} ws_0x\textit{HHHH} __________
\textit{YYYY}-\textit{MM}-\textit{DD} \textit{hh}:\textit{mm}:\textit{ss} ws_0x\textit{HHHH} \textit{Title_text}
\textit{YYYY}-\textit{MM}-\textit{DD} \textit{hh}:\textit{mm}:\textit{ss} ws_0x\textit{HHHH} ~~~~~~~~~~
\end{CodeParWithCC}
}\par
\noindent and the delimiter entry, which looks like\par
{\small%
\begin{CodeParWithCC}{\\\{\}}
\textit{YYYY}-\textit{MM}-\textit{DD} \textit{hh}:\textit{mm}:\textit{ss} ws_0x\textit{HHHH} ____________________________________________________
\textit{YYYY}-\textit{MM}-\textit{DD} \textit{hh}:\textit{mm}:\textit{ss} ws_0x\textit{HHHH} ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
\end{CodeParWithCC}
}
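
Header and delimiter entries can be told apart from ordinary entries by their
content alone. The following Python sketch is not part of the library. It
assumes that a rule line consists solely of \verb|_| or \verb|~| characters,
as in the examples above, and that the date, time and workset prefix has
already been stripped from every entry. The names \verb|is_rule| and
\verb|find_marks| are hypothetical.\par
\begin{CodePar}
def is_rule(content, char):
    """True if content is a non-empty run of one character."""
    return len(content) > 0 and set(content) == {char}

def find_marks(contents):
    """List ("header", title) and ("delimiter", None) marks."""
    marks = []
    i = 0
    while i < len(contents):
        if is_rule(contents[i], "_"):
            nxt = contents[i + 1:i + 3]
            # "_" rule directly followed by "~" rule
            if len(nxt) >= 1 and is_rule(nxt[0], "~"):
                marks.append(("delimiter", None))
                i += 2
                continue
            # "_" rule, one title line, then "~" rule
            if len(nxt) == 2 and is_rule(nxt[1], "~"):
                marks.append(("header", nxt[0]))
                i += 3
                continue
        i += 1
    return marks
\end{CodePar}
\noindent%
The sketch deliberately ignores the length of the rules, because the text
above does not state whether those lengths are fixed.
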
\input{name_conventions-english.tex}
\section{OpenCL program build tools}
\label{sec:buildutils}
The library includes three executable files:\nopagebreak\par
\begin{itemize}[leftmargin=1.75\parindent]
\item\verb|oclh_cr| -- compiles an OpenCL program into an OpenCL object;
\item\verb|oclh_lr| -- links OpenCL objects;
\item\verb|oclh_br| -- completely builds an OpenCL program.
\end{itemize}\nopagebreak\par
While these programs run, a detailed diagnostic log is written to the
\verb|oclh_*r.log| file (named after the tool), which stores exhaustive
information on all available GPGPU devices, the platforms used, and the
contexts created for the build. In fact, you can run, for example,
\verb|oclh_cr| with any input file, even with itself as
\verb|./oclh_cr oclh_cr|. The input file will, of course, not be built into an
OpenCL object, but the \verb|oclh_cr.log| log file will contain complete
information on the GPGPU devices found in the system. The log file format is
human-readable and suited to substring search with the \verb|grep| command
and its analogues; it is described in~section~\ref{subsec:logformat}.

Let us take a look at the use cases for each of these tools.
\input{tools_compiler-english.tex}
\input{tools_linker-english.tex}
\input{tools_builder-english.tex}
\section{Using the OpenCL\_helpers library. Structures, functions and headers}
\label{sec:libraryusing}
Stub. The section will be completed after sufficient testing of functionality.
\subsection{Structures}
\label{subsec:structures}
Stub. The section will be completed after sufficient testing of functionality.
\subsubsection{Main structure of the working configuration}
\label{subsec:workset}
Stub. The section will be completed after sufficient testing of functionality.
\subsection{Logging functions}
\label{subsec:logfunctions}
Stub. The section will be completed after sufficient testing of functionality.
\subsection{Common header files for CPU and GPGPU code}
\label{subsec:sharedheaders}
Stub. The section will be completed after sufficient testing of functionality.
\section{Memory management and pointer reinterpretation in OpenCL C programs}
\label{sec:memalloc}
Stub. The section will be completed after sufficient testing of functionality.
\section{Parallelism inside GPU}
\label{sec:squadmodel}
Stub. The section will be completed after sufficient testing of functionality.
\let\originalstyle=\thispagestyle
\def\thispagestyle#1{}
\printindex
\let\thispagestyle=\originalstyle
\tableofcontents
\end{document}