commit b72c2c33a1df9d77d67fb77d52ea7c67a02e379f
parent 17839623927e816114d4f6858520958748d0ede2
Author: tongong <tongong@gmx.net>
Date: Sat, 2 Jul 2022 12:19:46 +0200
readme update & bugfixes
Diffstat:
9 files changed, 93 insertions(+), 25 deletions(-)
diff --git a/README.md b/README.md
@@ -2,15 +2,20 @@
`tacker` takes your files and staples them together. The goal of this project
is to be a simple web bundler independent of the disaster that is the modern
-npm ecosystem. Advanced bundling and optimization techniques are not in scope
+npm ecosystem. The main use case of `tacker` is bundling single page
+applications into a single `.html` file for easier distribution. You can of
+course also use it to quickly get access to modularity when developing
+userscripts, to inline images into your static page, etc.
+
+Advanced bundling and optimization techniques are not in scope
of `tacker` - try one of the bloated mainstream bundlers instead:
- webpack (75 dependencies)
- parcel (184 dependencies)
- browserify (175 dependencies)
- ...
-`tacker` was written as an experimental project in the new `hare` programming
-language.
+`tacker` was written as an experimental project in the new
+[hare programming language](https://harelang.org/).
## features
- entrypoints:
@@ -23,11 +28,25 @@ language.
- other style sheets (`@import url(...)`)
- binary data as base64 (e.g. `background-image: url(...)`)
- JS
- - a subset of CommonJS modules
+ - a subset of CommonJS modules (see below for important drawbacks)
- `require(...)`
- `module.exports` and `exports`
- - binary data as base64 through custom `requireBinary(...)` function
+The "conceptual module name space root" is the working directory. This means
+that required paths which are not relative are resolved from the cwd. For
+security reasons only files in the cwd can be bundled. This can be changed with
+the `-p` option. Input and output file names stay relative to the cwd. The
+`.js` in `require()` imports is optional.
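A minimal sketch of the resolution rule described above, written in JavaScript for illustration only. `resolveModule` and its `base` parameter are hypothetical names and not part of `tacker`, which is written in Hare. The sketch assumes POSIX-style paths, does not resolve symlinks, and guesses that a missing extension is simply completed with `.js`.

```javascript
const path = require("path").posix;

// Hypothetical helper, not tacker's implementation.
// request:  the string passed to require()
// fromFile: absolute path of the file containing the require() call
// base:     stands in for the cwd (or the directory given with -p)
function resolveModule(request, fromFile, base) {
    // Relative requests start at the requiring file, all others at the base.
    const relative = request.startsWith("./") || request.startsWith("../");
    const start = relative ? path.dirname(fromFile) : base;
    let resolved = path.resolve(start, request);
    // Assumption: the optional ".js" is appended when it is missing.
    if (path.extname(resolved) !== ".js") resolved += ".js";
    // Refuse anything that ends up outside the base directory.
    if (!resolved.startsWith(base + "/")) {
        throw new Error(`file path "${resolved}" violates the base path "${base}".`);
    }
    return resolved;
}

console.log(resolveModule("./c", "/proj/src/b.js", "/proj"));         // /proj/src/c.js
console.log(resolveModule("lib/util.js", "/proj/src/b.js", "/proj")); // /proj/lib/util.js
```

A request such as `../../etc/passwd` from the same file resolves outside `/proj` and makes this sketch throw.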
+
+`tacker` does not aim to be 100% spec-compliant. The goal is to work in all
+common scenarios without placing too much emphasis on obscure edge cases. It is a
+tacker after all - not an industrial robot. Unlike a real-world tacker, though,
+it should never put your security at risk. Malicious source files can obviously
+take over your bundled page but they can never take over your system.
+
+## known bugs & missing features
+
+### require()
CommonJS was chosen out of personal preference and its simplicity compared to
ES Modules (tree-shaking optimizations enabled by ES Modules would not be
implemented either way). The parser is rather simple though. To conform to the
@@ -37,16 +56,42 @@ value and not special syntax the same is the case for the whole program as
every function could be possibly rebound to `require`. This requires the
complete execution of the program at bundle-time to be able to reason about
possible aliases to `require`. This is impossible and thus `require()` will be
-treated as special syntax. This implementation is thus wrong but should work
-for every sane usage of `require()`.
+treated as special syntax. This implementation (and in fact every CommonJS
+bundler) is thus wrong but should work for every sane usage of `require()`.
-The "conceptual module name space root" is the working directory. This means
-that required paths which are not relative are resolved from the cwd. For
-security reasons only files in the cwd can be bundled. This can be changed with
-the `-p` option. Input and output file name stay relative to the cwd.
+The `require()` macro expects a string literal with single or double quotes as
+its only argument. Whitespace between `(` and `"`/`'` or between `"`/`'` and `)`
+is forbidden. Currently no escape sequences are allowed as this would add a lot
+of complexity and is not needed for sane file names. This feature may be added
+in the future.
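The accepted call shape can be illustrated with a small JavaScript sketch. The pattern and the `parseRequire` helper are hypothetical and only mirror the rules stated above (one quoted string, no whitespace inside the parentheses, no escape sequences); they are not `tacker`'s actual parser.

```javascript
// Hypothetical pattern: require("...") or require('...') with matching
// quotes, no whitespace next to the parentheses and no backslashes.
const REQUIRE_CALL = /\brequire\((["'])((?:(?!\1)[^\\\n])+)\1\)/;

// Returns the required path, or null if the call is not in the accepted form.
function parseRequire(source) {
    const m = REQUIRE_CALL.exec(source);
    return m === null ? null : m[2];
}

console.log(parseRequire('require("./b.js")'));   // ./b.js
console.log(parseRequire("require('./c')"));      // ./c
console.log(parseRequire('require( "./b.js" )')); // null (whitespace)
console.log(parseRequire('require("a\\"b")'));    // null (escape sequence)
```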
-`tacker` does not aim to be 100% spec-compliant. The goal is to work in all
-common scenarios without laying to much emphasis on obscure edge cases. It is a
-tacker after all - not an industrial robot. Though unlike a real-world tacker
-your security should not be at hazard. Malicious source files can obviously
-take over your bundled page but they can never take over your system.
+Correctly expanding the `require()` macro requires recognizing string
+literals (so that string content is not changed by accident). This in turn
+requires correctly recognizing regex literals, as they could contain quote
+characters, and as far as I know this requires parsing the whole program into
+an AST (how else to decide if `/5/` is a regex or part of an arithmetic
+expression?). A similar problem arises for template literals. To avoid this
+complexity `tacker` only reads until reaching the first string, regex or
+template literal. This means that module imports have to be at the top of each
+source file, which is already the case for most projects. All potentially
+skipped `require()` calls are reported as warnings.
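The "read until the first literal" strategy could look roughly like the following JavaScript sketch. `scanImports` is a hypothetical name and this is not `tacker`'s implementation: it assumes that the string inside a recognized `require()` call does not end the import section, it skips comments, and it deliberately does not try to detect regex literals.

```javascript
// Hypothetical sketch: collect require() calls at the top of a file and
// count the ones that appear after the first other literal.
function scanImports(source) {
    const call = /^require\((["'])((?:(?!\1)[^\\\n])+)\1\)/;
    const imports = [];
    let i = 0;
    while (i < source.length) {
        const rest = source.slice(i);
        const prev = i === 0 ? "" : source[i - 1];
        // Do not match identifiers that merely end in "require".
        const m = /[\w$]/.test(prev) ? null : call.exec(rest);
        if (m !== null) {
            imports.push(m[2]);
            i += m[0].length;
        } else if (rest.startsWith("//")) {
            const end = source.indexOf("\n", i);
            i = end === -1 ? source.length : end;
        } else if (rest.startsWith("/*")) {
            const end = source.indexOf("*/", i + 2);
            i = end === -1 ? source.length : end + 2;
        } else if (['"', "'", "`"].includes(source[i])) {
            break; // first literal that is not a require() argument
        } else {
            i += 1;
        }
    }
    // Everything after the stop position is only checked for a warning.
    const skipped = (source.slice(i).match(/\brequire\(/g) || []).length;
    return { imports, skipped };
}

const example = [
    'const b = require("./b");',
    "const c = require('./c.js');",
    'let r = "this require(\'b.js\') will not be macro-expanded.";',
    'function a() { console.log(require("test")); }',
].join("\n");
const result = scanImports(example);
console.log(result.imports.join(", "), result.skipped); // ./b, ./c.js 2
```

Both the `require(` inside the string and the one inside the function count as potentially skipped here, since the sketch cannot tell them apart without parsing the rest of the file.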
+
+### script end tags & regex literals
+When inlining JavaScript in HTML, the script cannot contain script end tags
+(`</script>`). To handle this, all occurrences of `</script` will be replaced by
+`<\/script`. This is safe in string literals and comments, and the sequence
+should never occur in normal code. I am however not sure about regex literals -
+there could be very rare edge cases where these break.
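The replacement described above can be sketched in a few lines of JavaScript. `escapeScriptEnd` is a hypothetical helper for illustration and not the Hare code used by `tacker`.

```javascript
// Hypothetical helper: make a script safe to place inside <script>...</script>
// by breaking up every "</script" sequence.
function escapeScriptEnd(js) {
    return js.replace(/<\/script/g, "<\\/script");
}

const src = 'let tag = "</script>";';
console.log(escapeScriptEnd(src)); // let tag = "<\/script>";
```

Inside a string literal `\/` is just an escaped `/`, so the escaped program still produces the same string value at runtime.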
+
+### external resources
+Bundling of external scripts, images, etc. is currently forbidden and `tacker`
+will throw an error. There are two alternative behaviors:
+
+1. Bundling the external resource: I think it is a bad idea to bundle random
+ assets from the internet.
+2. Allowing references to external resources without bundling them: This would
+ be a better way of handling external resources but it creates a runtime
+ dependency which is not very sustainable considering link rot.
+
+It would be possible to enable behavior (2) via a command-line flag but I
+currently do not see the point in implementing this feature.
diff --git a/bundle_html.ha b/bundle_html.ha
@@ -55,7 +55,7 @@ fn tacker_html(inputpath: str, ofile: io::handle) void = {
const src = resolve_path(src,
inputpath);
defer free(src);
- tacker_js(src, ofile);
+ tacker_js(src, ofile, true);
fmt::fprint(ofile, "</script>")!;
};
} else if (m == 2) {
@@ -86,7 +86,7 @@ fn tacker_html(inputpath: str, ofile: io::handle) void = {
const href = resolve_path(href,
inputpath);
defer free(href);
- tacker_js(href, ofile);
+ tacker_css(href, ofile);
fmt::fprint(ofile, "</style>")!;
};
} else {
diff --git a/bundle_js.ha b/bundle_js.ha
@@ -2,7 +2,11 @@ use fmt;
use io;
use os;
-fn tacker_js(inputpath: str, ofile: io::handle) void = {
+// html: true if the output can be inlined in an HTML script tag. This is
+// important because code like e.g.
+// let tag = "</script>";
+// has to be escaped.
+fn tacker_js(inputpath: str, ofile: io::handle, html: bool) void = {
const ifile = os::open(inputpath)!;
defer io::close(ifile)!;
// TODO
diff --git a/main.ha b/main.ha
@@ -51,12 +51,12 @@ export fn main() void = {
let extstart = lastdotindex(ifile);
if (extstart == -1)
- fmt::fatalf("file \"{}\" has broken filetype.", ifile);
+ fixed_fatalf("file \"{}\" has broken filetype.", ifile);
let ext = strings::fromutf8(strings::toutf8(ifile)[(extstart + 1)..]);
switch (ext) {
case "html" => tacker_html(ifile, ofile);
- case "js" => tacker_js(ifile, ofile);
+ case "js" => tacker_js(ifile, ofile, false);
case "css" => tacker_css(ifile, ofile);
case => fixed_fatalf("unknown filetype: \"{}\".", ifile);
};
diff --git a/path_helpers.ha b/path_helpers.ha
@@ -1,4 +1,3 @@
-use fmt;
use fs;
use os;
use slices;
@@ -23,7 +22,8 @@ fn realpath_resolve(path: str) str = {
const p = match (os::realpath(path)) {
case let p: str => yield p;
case let p: fs::error =>
- fmt::fatalf("path \"{}\" does not exist.", path);
+ fixed_fatalf("path \"{}\" does not exist.", path);
+ yield ""; // unreachable
};
return os::resolve(p);
};
@@ -32,7 +32,12 @@ fn realpath_resolve(path: str) str = {
// from: path to the file (or directory) where the reference was found.
// Return value has to be freed.
fn resolve_path(path: str, from: str) str = {
- // directory path is relativ to
+ if (strings::hasprefix(path, "http://") ||
+ strings::hasprefix(path, "https://")) {
+ fixed_fatalf("bundling of external resources is not allowed: \"{}\".",
+ path);
+ };
+ // directory path is relative to base
// ends with "/"
const base = if (strings::hasprefix(path, "./") ||
strings::hasprefix(path, "../")) {
@@ -44,7 +49,7 @@ fn resolve_path(path: str, from: str) str = {
defer free(r);
const r = strings::dup(realpath_resolve(r));
if (!strings::hasprefix(r, basepath))
- fmt::fatalf("file path \"{}\" violates the base path \"{}\".",
+ fixed_fatalf("file path \"{}\" violates the base path \"{}\".",
r, basepath);
return r;
};
diff --git a/test-page/a.js b/test-page/a.js
@@ -1,4 +1,10 @@
// let testm = require("./b.js")
// console.log(testm.hello());
+let r = "this require('b.js') will not be macro-expanded.";
console.log("hi from an imported script!");
+
+function a() {
+ // this should throw a warning
+ console.log(require("test"));
+};
diff --git a/test-page/b.js b/test-page/b.js
@@ -0,0 +1,4 @@
+module.exports = {
+ hello: () => ":)",
+ c: require("./c"),
+}
diff --git a/test-page/c.js b/test-page/c.js
@@ -0,0 +1,2 @@
+console.log(require("./a.js"));
+exports.msg = ":)";
diff --git a/test-page/index.html b/test-page/index.html
@@ -13,5 +13,7 @@
<h1>test page</h1>
a nice image:
<img src=./example.png alt="example image"/>
+ <!-- uncomment to test external resources ban -->
+ <!-- <img src="https://www.wikipedia.org/portal/wikipedia.org/assets/img/Wikipedia-logo-v2.png"/> -->
</body>
</html>